KR102635657B1 - Segway Robot Based on Reinforcement Learning and Its Method for Motion Control

Info

Publication number: KR102635657B1
Application number: KR1020210173945A
Authority: KR
Inventors: 유영우
Original assignee: 현대오토에버 주식회사
Priority date: 2021-12-07
Filing date: 2021-12-07
Publication date: 2024-02-08
Anticipated expiration: 2041-12-07
Also published as: KR20230085609A

Abstract

One embodiment of the present invention may provide a reinforcement learning-based robot driving state control system comprising: a driving data collection unit that collects driving data including inertial data measured by an IMU sensor of the robot and motor rotation data measured by an encoder of the robot; a control logic operation unit that calculates a robot state value from the driving data obtained by the driving data collection unit and performs reinforcement learning that compares it with a target robot state value; and a robot driving control unit that generates a robot control signal for controlling the driving conditions of the robot based on the robot state value obtained by the control logic operation unit.

Description

Segway robot based on reinforcement learning and its motion control method {Segway Robot Based on Reinforcement Learning and Its Method for Motion Control}

This embodiment relates to a reinforcement learning-based Segway-type robot and a motion control method therefor, and more specifically to a motion control method for self-balancing and autonomous driving of a Segway-type robot to which deep reinforcement learning is applied.

A Segway is a means of transport that carries a user by means of an electric motor, and it is attracting attention as a next-generation mobility option for replacing internal combustion engine vehicles.

Unlike conventional four-wheel drive vehicles, a Segway uses one or two wheels, so the driver's safety is strongly affected by changes in the driving environment. For example, when driving conditions change because of a shift in the center of gravity or wheel spin, the Segway needs to learn on its own and search for the optimal driving conditions.

In particular, the driving conditions when the Segway's wheels slip (spin) differ from those under normal circumstances, so these driving characteristics need to be reflected appropriately.

While a conventional Segway-type robot is driving, changes in its physical characteristics (for example, changes in mass, speed, or acceleration) cannot be sensed in real time, so the data that serves as the basis for the robot's self-balancing and autonomous driving cannot be collected or updated.

In addition, conventional Segway-type robots such as that of KR 10-1816136 B1 do not comprehensively consider control parameters such as mass (m) and rotational inertia (I), and are therefore limited in that a sophisticated learning model cannot be designed.

Against this background, an object of this embodiment is, in one aspect, to provide a Segway-type robot and a motion control method therefor that apply a reinforcement learning algorithm reflecting parameters related to the mass and rotation of the Segway-type robot and that can control the robot's motion by reflecting real-time driving situations.

In another aspect, an object of this embodiment is to provide a Segway-type robot and a motion control method therefor that can determine the driving state of the robot and perform learning by reflecting whether the wheels of the Segway-type robot slip (spin).

In another aspect, an object of this embodiment is to provide a Segway-type robot capable of self-balancing and autonomous driving, and a motion control method therefor, by applying a deep learning algorithm that predicts the driving state of the robot.

In order to achieve the above-described objects, one embodiment of the present invention may provide a reinforcement learning-based robot driving state control system comprising: a driving data collection unit that collects driving data including inertial data measured by an IMU sensor of the robot and motor rotation data measured by an encoder of the robot; a control logic operation unit that calculates a robot state value from the driving data obtained by the driving data collection unit and performs reinforcement learning that compares it with a target robot state value; and a robot driving control unit that generates a robot control signal for controlling the driving conditions of the robot based on the robot state value obtained by the control logic operation unit.

In the robot driving state control system, the robot may be a Segway; the inertial data may include angular velocity values and acceleration values about the robot's x-, y-, and z-axes; the rotation data may include a rotation angle value for each wheel of the robot; the posture of the robot may be recognized based on the inertial data; and the movement of the wheels may be recognized based on the rotation data.

In the robot driving state control system, the control logic operation unit may obtain the robot state value by taking the linear acceleration value (a) of the robot as an input and calculating the mass value (m) and the rotational inertia value (I) of the robot.

In the robot driving state control system, the reinforcement learning may calculate the difference between the obtained robot state value and a preset target robot state value, terminate learning when the difference is 0 or is deemed to be within a reference range, and update the robot state value when the difference is outside the reference range.

In the robot driving state control system, the reinforcement learning may be performed by a simulation program, or by receiving a robot state estimate transmitted from a user terminal.

In the robot driving state control system, the robot state value may include one or more of the robot's mass, rotational inertia, tilt, tilt change amount, speed, acceleration, wheel torque, and wheel angular velocity.

In the robot driving state control system, the control logic operation unit may estimate the robot state value by obtaining linear motion speed data from the sum of the speeds of the robot's left and right wheels and rotational motion speed data from the difference between the speeds of the left and right wheels.

The robot driving state control system may further include a robot driving control unit that controls the operation of the motor by transmitting, to a motor driver, a robot control signal corresponding to the robot state value calculated by the control logic operation unit.

In order to achieve the above-described objects, another embodiment of the present invention may provide a method for determining and controlling the driving state of a Segway, the method comprising: collecting driving data of the Segway; determining whether a wheel is spinning based on the driving data; when no wheel spin occurs, performing control logic that calculates a Segway state value and compares it with a target Segway state value; and controlling the driving of a motor so as to correspond to the Segway state value.

In the Segway driving state control method, the Segway may include sensors that collect the driving data, and the driving data may include inertial data including angular velocity and acceleration values of the Segway and rotation data including a rotation angle value for each wheel of the Segway.

In the Segway driving state control method, the Segway may further include an encoder that measures the rotation angle displacement of the wheel, and a processor that determines the rotation of the wheel by comparing the number of wheel rotations measured by the encoder with a target number of wheel rotations.

In the Segway driving state control method, whether the wheel is spinning may be determined by comparing the straight-line travel distance with the wheel rotation distance, and spin may be determined to exist when the wheel rotation distance is greater than the straight-line travel distance.
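The spin check described above can be illustrated with a minimal Python sketch. The function names, the wheel radius parameter, and the small tolerance used to absorb measurement noise are assumptions made for illustration; the source only states that spin is judged to exist when the wheel rotation distance exceeds the straight-line travel distance.

```python
def wheel_travel_distance(rotation_angle_rad: float, wheel_radius_m: float) -> float:
    """Distance the wheel rim has rolled, from the encoder rotation angle (arc length)."""
    return rotation_angle_rad * wheel_radius_m


def is_spinning(wheel_distance_m: float, linear_distance_m: float,
                tolerance_m: float = 0.01) -> bool:
    """Return True when the wheel has rolled farther than the body actually moved.

    tolerance_m is a hypothetical noise margin, not a value from the source.
    """
    return (wheel_distance_m - linear_distance_m) > tolerance_m
```

For example, a wheel of radius 0.1 m that has turned 10 rad has rolled 1.0 m; if the body has only advanced 0.5 m, the check reports spin.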

In the Segway driving state control method, the Segway may further include a hub motor that provides driving force to the wheel, and a processor that obtains acceleration data by applying output data of the hub motor to a preset algorithm.

The Segway driving state control method may further include performing a control algorithm that determines whether a disturbance has occurred from a change in mass (m) or rotational inertia (I) arising while the Segway is driving.

In the Segway driving state control method, when it is determined that there is a difference between the Segway state value and the target Segway state value, reinforcement learning may be performed by iterating on the Segway state value so that it converges to the target Segway state value.

In the Segway driving state control method, the driving of the wheels may be controlled individually by transmitting a motor control signal corresponding to the Segway state value to a motor driver.

As described above, according to an embodiment of the present invention, the operation of the robot can be controlled in a way that reflects the real-time driving situation by applying a reinforcement learning algorithm that reflects parameters related to the mass and rotation of the Segway-type robot.

According to an embodiment of the present invention, the driving state of the robot can be controlled by reflecting whether the wheels of the Segway-type robot slip (spin) and how the driving state changes.

According to an embodiment of the present invention, extracting and learning only some parameters from the modeling parameters of the Segway-type robot can reduce the processor's computational load for self-balancing and autonomous driving and can speed up the reinforcement learning for self-balancing and autonomous driving.

Figure 1 is a diagram showing a data communication method between a Segway and a processor according to an embodiment of the present invention.
Figure 2 is a diagram explaining the operation sequence of a processor according to an embodiment of the present invention.
Figure 3 is a diagram explaining the operation control sequence of a Segway according to an embodiment of the present invention.
Figure 4 is a configuration diagram of a Segway according to an embodiment of the present invention.
Figure 5 is a diagram explaining modeling parameters of a Segway according to an embodiment of the present invention.
Figure 6 is a diagram explaining a method of calculating modeling parameters of a Segway according to an embodiment of the present invention.
Figure 7 is a diagram illustrating a reinforcement learning method according to an embodiment of the present invention.
Figure 8 is a flowchart explaining a method of generating a control signal for a Segway according to an embodiment of the present invention.

Hereinafter, some embodiments of the present invention will be described in detail with reference to illustrative drawings. In adding reference numerals to the components of each drawing, it should be noted that the same components are given the same reference numerals wherever possible, even when they appear in different drawings. In addition, in describing the present invention, detailed descriptions of related known configurations or functions are omitted where they are judged likely to obscure the gist of the present invention.

In addition, terms such as first, second, a, and b may be used in describing the components of the present invention. These terms serve only to distinguish one component from another, and the nature, sequence, or order of the component is not limited by the term. When a component is described as being "connected," "coupled," or "joined" to another component, that component may be directly connected or joined to the other component, but it should be understood that yet another component may also be "connected," "coupled," or "joined" between them.

In addition, the terms 'system', 'server', and the like used in this specification may be defined as a computer program or device that transmits and receives information to and from a client over a network for data storage and computation, but are not limited thereto.

In addition, the terms 'Segway', 'Segway-type robot', and 'robot' used in this specification may be defined as a device that provides driving force to transport a user, and the terms may be used interchangeably as needed.

Figure 1 is a diagram showing a data communication method between a Segway and a processor according to an embodiment of the present invention.

Referring to FIG. 1, the Segway 10 and the processor 20 according to one embodiment can control the operation of the Segway through data communication.

The Segway 10 may be a robot for user transportation that includes one or more wheels. For example, the Segway 10 may include two wheels and motors, may include a motor driver for controlling the motor of each wheel, and may transmit and receive control signals of the motor driver according to the calculation results of the processor 20.

The Segway 10 may further include an IMU sensor (not shown) capable of measuring inertial data and an encoder (not shown) capable of measuring rotation data of the motor.

The IMU sensor (not shown) is an inertial measurement unit (IMU) and can measure inertial data such as angular velocity and acceleration values about the robot's three axes (for example, the x-, y-, and z-axes). In addition, the IMU sensor can measure the robot's tilt and posture in each direction by combining a 3-axis accelerometer and a 3-axis gyro sensor. For example, the IMU sensor may obtain measured values by applying a Kalman filter, but is not limited thereto.
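As a rough illustration of how gyro and accelerometer readings might be fused into a tilt estimate, the following Python sketch uses a one-state Kalman filter. The source names the Kalman filter only as one option and gives no model, so the scalar state, the noise values `q` and `r`, the axis convention in `accel_tilt`, and all names here are assumptions.

```python
import math


def accel_tilt(ax: float, az: float) -> float:
    """Tilt angle (rad) implied by the gravity direction seen by the accelerometer.

    Assumes x points forward and z points up when the robot is upright.
    """
    return math.atan2(ax, az)


class TiltKalman1D:
    """Scalar Kalman filter: the gyro rate drives the prediction, the accelerometer corrects it."""

    def __init__(self, q: float = 0.001, r: float = 0.03):
        self.angle = 0.0  # estimated tilt (rad)
        self.p = 1.0      # estimate variance
        self.q = q        # process noise added per step (assumed value)
        self.r = r        # accelerometer measurement noise (assumed value)

    def update(self, gyro_rate: float, accel_angle: float, dt: float) -> float:
        # Predict: integrate the gyro rate and inflate the uncertainty.
        self.angle += gyro_rate * dt
        self.p += self.q
        # Correct: blend in the accelerometer angle using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.angle += k * (accel_angle - self.angle)
        self.p *= (1.0 - k)
        return self.angle
```

With a stationary robot (zero gyro rate) and a constant accelerometer angle, repeated calls to `update` settle on that angle; a practical implementation would typically also estimate gyro bias.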

The encoder (not shown) can measure rotation data such as the number of revolutions, rotation speed, rotation angle, and rotation direction of the motor. In addition, the encoder can measure the movement of the robot's wheels based on the rotation data. For example, the encoder may use any of various measurement methods, such as optical or magnetic encoding, but is not limited thereto.
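To make the encoder's role concrete, here is a small Python sketch that converts raw encoder counts into a rotation angle and an angular velocity. The counts-per-revolution parameter and the function names are hypothetical; the source does not specify an encoder resolution or interface.

```python
import math


def counts_to_angle(counts: int, counts_per_rev: int) -> float:
    """Rotation angle in radians for a signed encoder count."""
    return 2.0 * math.pi * counts / counts_per_rev


def wheel_angular_velocity(prev_counts: int, curr_counts: int,
                           counts_per_rev: int, dt: float) -> float:
    """Average angular velocity (rad/s) between two encoder samples taken dt seconds apart.

    The sign of the result gives the rotation direction.
    """
    return counts_to_angle(curr_counts - prev_counts, counts_per_rev) / dt
```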

The processor 20 may be a computing device, such as a server, that is separate from the Segway 10, or it may be a program that is included in the Segway 10 and performs the calculations for controlling the operation of the Segway 10, or a device containing such a program.

The Segway 10 may transmit information about its driving state to the processor 20 (S101), and the processor 20 may receive and learn from the information about the driving state and transmit a control signal for controlling the driving state of the Segway 10 to the Segway 10 (S102).

The processor 20 may be implemented as a physical computing device, but is not limited thereto; it may also be implemented, for example as a cloud server, so as to receive information about the driving state of the Segway 10 and perform the calculations.

Figure 2 is a diagram explaining the operation sequence of a processor according to an embodiment of the present invention.

Referring to FIG. 2, the processor 20 may include a driving data collection unit 21, a control logic operation unit 22, a robot driving control unit 23, and the like. The blocks that implement each operation of the processor 20 may be physically or conceptually distinct blocks.

The driving data collection unit 21 may receive the Segway's driving data from each sensor and generate a robot control signal for controlling the robot.

The Segway's driving data may include inertial data measured by the IMU sensor, motor rotation data measured by the encoder, and the like.

The control logic operation unit 22 may calculate a robot state value from the driving data obtained by the driving data collection unit and perform reinforcement learning that compares it with a target robot state value.

The robot state value may be a parameter representing the driving state of the robot, such as the robot's mass, rotational inertia, tilt, tilt change amount, speed, acceleration, wheel torque, or wheel angular velocity.

For example, the control logic operation unit 22 may obtain the robot state value by taking the robot's linear acceleration value (a) as an input and calculating the robot's mass value (m) and rotational inertia value (I).
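One way to picture this calculation is the Python sketch below. It assumes Newton's second law in its simplest form (F = m a and τ = I α) with the net drive force and net torque known from the motor commands. The source does not give the actual formulas, so this model and every name in it are assumptions rather than the patented method.

```python
def estimate_mass(drive_force_n: float, linear_accel: float) -> float:
    """Mass estimate m = F / a, assuming the drive force is the only net force."""
    if abs(linear_accel) < 1e-6:
        raise ValueError("acceleration too small to estimate mass")
    return drive_force_n / linear_accel


def estimate_inertia(net_torque_nm: float, angular_accel: float) -> float:
    """Rotational inertia estimate I = tau / alpha, under the same idealization."""
    if abs(angular_accel) < 1e-6:
        raise ValueError("angular acceleration too small to estimate inertia")
    return net_torque_nm / angular_accel
```

Under these assumptions, a measured acceleration of 2 m/s^2 under a 50 N drive force implies a mass of 25 kg; a change in this estimate while driving would indicate, for example, a rider stepping on or off.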

The control logic operation unit 22 may obtain the robot state value through a preset algorithm and, where necessary, may perform machine learning (for example, reinforcement learning) to reach the desired target robot state value.

The reinforcement learning performed by the control logic operation unit 22 calculates the difference between the obtained robot state value and a preset target robot state value. Learning may be terminated when the difference is 0 or is deemed to be within a reference range, and the robot state value may be updated repeatedly when the difference is outside the reference range.
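The stop-or-update loop described above can be sketched in Python as follows. The source does not specify the update rule, so a simple proportional step stands in for the learning update here; it is not a reinforcement learning algorithm in itself, and the names, step size, and tolerance are assumptions.

```python
def iterate_to_target(state: float, target: float, tolerance: float = 1e-3,
                      step: float = 0.5, max_iters: int = 1000):
    """Repeat: compute the difference to the target, stop if within tolerance, else update.

    Returns the final state value and the number of updates performed.
    """
    for i in range(max_iters):
        diff = target - state
        if abs(diff) <= tolerance:   # difference is 0 or within the reference range
            return state, i
        state += step * diff         # placeholder for the learning update
    return state, max_iters
```

The `max_iters` cap reflects the case, mentioned in the text, where learning does not converge and an external input is needed instead.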

The reinforcement learning performed by the control logic operation unit 22 may be performed by a simulation program, or by receiving a robot state estimate transmitted from a user terminal. When the reinforcement learning does not converge, or when the driving conditions desired by the user are to be reflected, learning can be performed based on driving state values supplied from the user terminal, which makes it possible to adjust the offset of the robot's driving state and to improve user convenience.

The control logic operation unit 22 may estimate the robot state value by obtaining linear motion speed data from the sum of the speeds of the robot's left and right wheels, or by obtaining rotational motion speed data from the difference between the speeds of the left and right wheels. In this case, parameters related to the driving state (for example, the robot's linear motion speed and rotational angular velocity in FIG. 6) can be calculated using preset formulas.
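The sum-and-difference relationship can be illustrated with the standard differential-drive kinematic model, sketched here in Python. The formulas of FIG. 6 are not reproduced in this text, so the halving of the sum, the division by the track width, and the parameter names are assumptions taken from the textbook model rather than from the source.

```python
def body_velocities(v_left: float, v_right: float, track_width_m: float):
    """Linear speed (m/s) and yaw rate (rad/s) of a two-wheeled body.

    v_left and v_right are wheel rim speeds; track_width_m is the distance
    between the two wheels (a hypothetical parameter).
    """
    linear = (v_left + v_right) / 2.0             # from the sum of wheel speeds
    angular = (v_right - v_left) / track_width_m  # from the difference of wheel speeds
    return linear, angular
```

Equal wheel speeds give pure straight-line motion, and equal and opposite wheel speeds give rotation in place.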

The robot driving control unit 24 may control the driving conditions of the robot based on the robot state value obtained by the control logic operation unit. The robot state value obtained through repeated learning in the control logic operation unit 22 is updated, and based on it the robot control signal for controlling components such as the robot's motors can be changed in real time.

The robot driving control unit 24 may transmit the robot control signal for controlling the robot only when the current robot state value is determined to match the target robot state value or to fall within a reference error range.

The robot driving control unit 24 may control the operation of the motor by transmitting, to the motor driver, a robot control signal corresponding to the robot state value calculated by the control logic operation unit 22.

Figure 3 is a diagram explaining the operation control sequence of a Segway according to an embodiment of the present invention.

Referring to FIG. 3, the operation of the Segway 10 can be controlled using the calculation results of the processor 20 as input.

The Segway 10 may include an IMU sensor 11, a motor encoder 12, a motor driver 13, a left motor 14, a right motor 15, and the like.

The IMU sensor 11 may be a device that measures the angular velocity about each of the robot's three axes and the acceleration along each axis, and the motor encoder 12 may be a device that measures, for example, the rotation angle of the Segway's wheels relative to an angular reference point.

The Segway 10 may include a left motor 14 and a right motor 15 corresponding to the left and right wheels, a motor driver 13 that controls each motor, and the like.

The motor driver 13 may receive the robot control signal from the processor 20 and control driving conditions of the left and right motors 14 and 15, such as the number of revolutions, rotation speed, rotational acceleration, and rotation angle.

The processor 20 may include an inertial data collection unit 21-1, a rotation data collection unit 21-2, a control logic operation unit 22, a robot state value estimation unit 23, a robot driving control unit 24, and the like.

The inertial data collection unit 21-1 may collect the inertial data measured by the IMU sensor 11 and determine the robot's posture, such as its tilt.

The rotation data collection unit 21-2 may collect the rotation data measured by the motor encoder 12 and determine, for example, the movement of the wheels.

The control logic operation unit 22 may collect driving data such as the inertial data and rotation data and calculate a driving state value describing the driving state of the robot. In addition, the control logic operation unit 22 may update the driving state value by learning from time-series data, thereby generating the driving state value that the robot driving control unit 24 uses to determine the robot's operation.

The control logic operation unit 22 may learn from the input driving data using a preset learning algorithm, and may further decide whether to continue learning by evaluating the accuracy of the learning.

The control logic operation unit 22 may decide whether to end learning by comparing the target driving state value, which is data on the driving state desired by the user, with the driving state value, which is data on the robot's current driving state.

The robot state value estimation unit 23 can reduce the amount of learning by not performing calculations on every parameter and instead determining the learning order in consideration of the priorities of the parameters and the correlations among them. For example, as in FIG. 7, the acceleration obtained by the robot's IMU sensor may be compared with the target acceleration first, and the angular acceleration obtained by the motor encoder may be compared with the target angular acceleration next.
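The prioritized comparison can be sketched in Python as a loop that checks parameters in a fixed order and stops at the first one that is out of tolerance, so that lower-priority parameters are not evaluated until higher-priority ones match. The dictionary layout, parameter names, and tolerances are hypothetical; FIG. 7 is not reproduced here, so this illustrates the ordering idea only.

```python
from typing import Dict, List, Optional


def first_mismatch(measured: Dict[str, float], target: Dict[str, float],
                   priority: List[str], tolerances: Dict[str, float]) -> Optional[str]:
    """Return the highest-priority parameter whose error exceeds its tolerance, or None."""
    for name in priority:
        if abs(measured[name] - target[name]) > tolerances[name]:
            return name  # learn this parameter first; skip the rest for now
    return None


# Acceleration (from the IMU) is checked before angular acceleration (from the encoder).
PRIORITY = ["acceleration", "angular_acceleration"]
```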

로봇상태값 추정부(23)의 파라미터 추정이 완료되지 않거나, 추정된 파라미터가 목표로하는 값과 다르다고 판단된 경우에는 제어로직 연산부(22)로 결과값을 전달하여 다시 학습을 수행할 수 있다.If the parameter estimation of the robot state value estimation unit 23 is not completed or it is determined that the estimated parameters are different from the target value, the result value can be transmitted to the control logic calculation unit 22 to perform learning again.

로봇상태값 추정부(23)는 제어로직 연산부(22)의 하위 블록에 포함될 수 있으나, 필요에 따라 별도의 블록으로 정의되어 순차적인 데이터 연산 및 업데이트를 수행할 수 있다.The robot state value estimation unit 23 may be included in a lower block of the control logic operation unit 22, but if necessary, it may be defined as a separate block to perform sequential data operations and updates.

The robot driving control unit 24 may generate a driving control signal based on the robot state value determined and transmitted by the control logic operation unit 22, and may transmit the signal to the motor driver 13. Because the robot driving control unit 24 generates the driving control signal from the calculation results of the control logic operation unit 22, more precise driving control becomes possible.

Here, the Segway 10 and the processor 20 are distinguished only to explain the process of collecting and computing data and transmitting control signals. They are all or part of the components included in a single robot, and the technical idea of the present invention is not limited to FIG. 3.

FIG. 4 is a configuration diagram of a Segway according to an embodiment of the present invention.

Referring to FIG. 4, the Segway 100 may include a left wheel 101, a right wheel 102, a main body portion 103, a handle portion 104, and the like.

The linear motion and rotational motion of the Segway 100 may be determined by dynamically modeling them according to the rotation state of the wheels 101 and 102.

For example, when the center of mass of the Segway 100 is located close to the main body 103, the movement of the Segway 100 may be replaced by the movement of the main body 103, or the position of the center of gravity may be defined and used in the calculations.

For example, when the rotation distance of the wheels 101 and 102 equals the straight-line travel distance of the Segway 100, it may be determined that no wheel slip (spin) has occurred, and a correlation between the wheel rotation data and the straight-line travel distance data of the Segway may be defined.

In addition, when the current revolutions per minute (RPM) of the wheels 101 and 102 equal the target RPM, it may be determined that no wheel slip (spin) has occurred, and a correlation between the wheel rotation data and the straight-line travel distance data of the Segway may be defined.

The speed of the Segway 100 in the linear motion direction or in the rotational motion direction may be defined using the sum or the difference of the speeds of the left and right wheels 101 and 102.

The handle portion 104 or the main body portion 103 may be used to determine the tilt of the robot. For example, the tilt of the column of the handle portion 104 may be defined as the tilt of the robot, or the tilt of one plane of the main body portion 103 may be defined as the tilt of the robot.

FIG. 4 is provided to explain a method of simplifying and modeling each component of the Segway 100, and the technical idea of the present invention is not limited thereto.

FIG. 5 is a diagram explaining modeling parameters of a Segway according to an embodiment of the present invention.

Referring to FIG. 5, the configuration of the Segway 100 may be simplified for modeling.

The speed and acceleration of the linear motion of the Segway 100, and the rotation angle and angular velocity of its rotational motion, may be defined as motion data.

For the left wheel 101 of the Segway 100, the rotation angle, angular velocity, rotational inertia, angular acceleration, and the like may be defined as rotation data, and the same quantities may be defined as rotation data for the right wheel 102.

The rotation angle, angular velocity, torque, angular acceleration, and the like in the tilt direction of the Segway 100 may be defined as tilt data.

In addition, the total mass, the total rotational inertia, and the total angular momentum of the Segway 100, as well as the distance from one plane of the main body portion 103 to the center of mass (not shown), may be defined and used in calculating each parameter.

To obtain the driving state value according to an embodiment, the tilt parameters of the Segway 100 may be solutions obtained by substituting into the equation of motion of a rotary inverted pendulum, and solutions of the nonlinear equation may be obtained through learning such as iterative calculation.

FIG. 6 is a diagram explaining a method of calculating the modeling parameters of a Segway according to an embodiment of the present invention.

Referring to FIG. 6, estimated values of the driving state of the robot may be calculated using the Segway modeling parameters of FIG. 5.

The linear motion speed of the Segway may be derived from the sum of the speeds of the left and right wheels.

For example, the linear motion speed of the robot may be defined as the product of the wheel radius and the average rotational speed of the two wheels, and the linear motion acceleration of the robot may be calculated by differentiating the linear motion speed with respect to time.

The angular velocity of the rotational motion of the Segway may be derived from the difference between the speeds of the left and right wheels.

For example, the angular velocity of the rotational motion of the robot may be defined as the difference between the rotational speeds of the two wheels multiplied by the wheel radius and then divided by the distance between the wheels, and the rotation angle of the robot may be calculated by integrating this angular velocity over time.
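As a non-limiting illustration, the relations described for FIG. 6 can be sketched in Python as follows. The function names, the parameter names (`wheel_radius`, `wheel_spacing`), the sign convention (right wheel speed minus left wheel speed), and the fixed time step `dt` are assumptions made for this sketch and do not appear in the specification.

```python
def body_velocities(omega_left, omega_right, wheel_radius, wheel_spacing):
    """Return (linear speed, rotational angular velocity) of the robot body.

    omega_left / omega_right are wheel angular speeds in rad/s.
    """
    # Linear speed: wheel radius times the average of the two wheel speeds.
    linear_speed = wheel_radius * (omega_left + omega_right) / 2.0
    # Angular velocity: wheel speed difference times the wheel radius,
    # divided by the distance between the wheels.
    # Assumption: right minus left is taken as the positive direction.
    angular_velocity = wheel_radius * (omega_right - omega_left) / wheel_spacing
    return linear_speed, angular_velocity


def linear_acceleration(speed_prev, speed_now, dt):
    """Approximate the time derivative of the linear speed by a finite difference."""
    return (speed_now - speed_prev) / dt


def integrate_rotation_angle(angle, angular_velocity, dt):
    """Approximate the time integral of the angular velocity by one Euler step."""
    return angle + angular_velocity * dt
```

With equal wheel speeds the sketch yields pure linear motion, and with unequal wheel speeds it yields a nonzero angular velocity whose sign follows the assumed convention.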

FIG. 6 is provided to explain a method of calculating parameters related to the driving state of the Segway 100, and the technical idea of the present invention is not limited thereto.

FIG. 7 is a diagram illustrating a reinforcement learning method according to an embodiment of the present invention.

Referring to FIG. 7, a method 200 of reinforcement learning and robot motion control according to an embodiment may include: collecting sensor data (S201); calculating the acceleration of the robot (S202); comparing the current acceleration with the target acceleration (S203); calculating the angular velocity of the wheels (S204); comparing the current angular velocity with the target angular velocity (S205); calculating the mass of the robot (S206); calculating the angular acceleration of the robot (S207); comparing the current angular acceleration with the target angular acceleration (S208); calculating the angular momentum and rotational inertia (S209); calculating control parameters (S210); and controlling the motion of the robot (S211).

The step of collecting sensor data (S201) may be a step in which the sensors of the robot measure and collect various types of sensed values in the form of data.

The step of calculating the acceleration of the robot (S202) may be a step of taking the measured value of the IMU sensor as the acceleration.

The step of comparing the current acceleration with the target acceleration (S203) may be a step of comparing the calculated acceleration of the robot in its current state with the target acceleration. If the current acceleration equals the target acceleration or is within a preset reference range, the angular acceleration calculation step (S207) may be performed immediately. If the current acceleration does not equal the target acceleration or is outside the preset reference range, the step of calculating the angular velocity (S204) may be performed.

The step of calculating the angular velocity of the wheels (S204) may be a step of calculating the angular velocity of the left and right wheels or motors, and the measured values of the motor encoder may be used in this calculation.

The step of comparing the current angular velocity with the target angular velocity (S205) may be a step of comparing the calculated angular velocity of the robot in its current state with the target angular velocity. If the current angular velocity equals the target angular velocity or is within a preset reference range, the step of calculating the total mass of the robot (S206) may be performed immediately. If the current angular velocity does not equal the target angular velocity or is outside the preset reference range, the step of calculating control parameters (S210) may be performed.

The step of calculating the mass of the robot (S206) may be a step of calculating the total mass of the robot based on equations of classical mechanics such as Newton's laws of motion.

The step of calculating the angular acceleration of the robot (S207) may be a step of acquiring and calculating the angular acceleration value measured by the IMU sensor.

The step of comparing the current angular acceleration with the target angular acceleration (S208) may be a step of comparing the calculated angular acceleration of the robot in its current state with the target angular acceleration. If the current angular acceleration equals the target angular acceleration or is within a preset reference range, the calculation of the control parameters of the robot (S210) may be performed immediately. If the current angular acceleration does not equal the target angular acceleration or is outside the preset reference range, the step of calculating the angular momentum and rotational inertia (S209) may be performed.

The step of calculating the angular momentum and rotational inertia (S209) may be a step of calculating the angular momentum and rotational inertia of the robot based on equations of classical mechanics such as Newton's laws of motion. A torque value may be calculated from the equation of motion of a rotary inverted pendulum, and the rotational inertia may be calculated based on the angular acceleration obtained in the preceding step (S207). The angular momentum may be calculated using the rotational inertia formula, the parallel axis theorem, and the like.
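As a non-limiting illustration, the classical-mechanics relations that steps S206 and S209 refer to can be sketched in Python as follows. The specification does not give the exact equations, so the use of F = m·a, τ = I·α, L = I·ω, and the parallel axis theorem in this form, as well as all function and parameter names, are assumptions of this sketch.

```python
def estimate_mass(net_force, linear_accel):
    """Newton's second law, F = m * a, solved for the mass m."""
    if linear_accel == 0:
        raise ValueError("linear acceleration must be nonzero")
    return net_force / linear_accel


def estimate_rotational_inertia(torque, angular_accel):
    """Rotational form of Newton's second law, tau = I * alpha, solved for I."""
    if angular_accel == 0:
        raise ValueError("angular acceleration must be nonzero")
    return torque / angular_accel


def shift_inertia_axis(inertia_about_com, mass, axis_offset):
    """Parallel axis theorem: I = I_com + m * d**2."""
    return inertia_about_com + mass * axis_offset ** 2


def angular_momentum(inertia, angular_velocity):
    """Angular momentum about a fixed axis: L = I * omega."""
    return inertia * angular_velocity
```

In this sketch the torque would come from the inverted-pendulum equation of motion and the angular acceleration from step S207. Both are simply passed in as numbers here.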

The step of calculating control parameters (S210) may be a step of finalizing the parameters obtained through the logic described above and deciding whether to end learning.

The step of controlling the motion of the robot (S211) may be a step of controlling the motion of the robot by generating and transmitting a robot control signal corresponding to the robot state value obtained from the finalized parameters.
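As a non-limiting illustration, the branching among steps S201 to S211 described above can be sketched in Python as a function that returns the order in which the steps would run. The function names, the single shared tolerance `tol`, and the assumption that S206 is followed by S207 are choices made for this sketch; the specification only states the branch targets of S203, S205, and S208.

```python
def within_range(current, target, tol):
    """True when the current value equals the target or lies within the reference range."""
    return abs(current - target) <= tol


def fig7_step_sequence(accel, target_accel,
                       wheel_rate, target_wheel_rate,
                       ang_accel, target_ang_accel,
                       tol=0.05):
    """Return the list of step labels visited for one pass through the flow."""
    steps = ["S201", "S202", "S203"]
    if not within_range(accel, target_accel, tol):
        # Acceleration is off target: calculate and compare wheel angular velocity.
        steps += ["S204", "S205"]
        if not within_range(wheel_rate, target_wheel_rate, tol):
            # Angular velocity is off target: go straight to the control parameters.
            steps += ["S210", "S211"]
            return steps
        steps.append("S206")  # Angular velocity on target: calculate total mass.
        # Assumption: the flow continues from S206 to S207.
    steps += ["S207", "S208"]
    if not within_range(ang_accel, target_ang_accel, tol):
        # Angular acceleration is off target: calculate momentum and inertia.
        steps.append("S209")
    steps += ["S210", "S211"]
    return steps
```

When every measured value is already within range, the sketch skips S204 to S206 and S209, which reflects how the prioritized comparison order reduces the amount of calculation.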

FIG. 8 is a flowchart explaining a method of generating a control signal for a Segway according to an embodiment of the present invention.

Referring to FIG. 8, a method 300 of determining and controlling the driving state of a Segway may include: collecting driving data of the Segway (S301); determining whether a wheel spins based on the driving data (S302); determining whether a disturbance occurs while the Segway is driving (S303); performing Segway control logic (S304); and generating a motor control signal of the Segway to control the driving of the motor (S305).

The step of collecting driving data of the Segway (S301) may be a step of collecting the data generated while the Segway is driving. The data collected through the sensors may be processed into parameters for defining and controlling the driving state of the Segway.

The driving data may include various data sets, such as inertial data including angular velocity and acceleration values of the Segway, and rotation data including a rotation angle value for each wheel of the Segway. For example, when the motor that provides driving force to the wheels of the Segway is a hub motor, data such as the rotational speed and rotational acceleration of the motor may be obtained with respect to the hub axis. In addition, separately converted data may be obtained by applying the output data of the motor to a preset algorithm.

The step of determining whether a wheel spins based on the driving data (S302) may be a step of determining, from the acquired driving data, whether a wheel of the Segway is spinning. When wheel spin occurs, the consistency and correlation of the acquired driving data become unreliable, so whether the wheel spins may be determined first in order to simplify the driving model and improve its accuracy.

The process may proceed to the steps of calculating the parameters and the Segway driving state value only when it is determined that no wheel spin exists.

The Segway according to an embodiment may further include an encoder that measures the rotation angle displacement of the wheel, and a processor that determines the rotation of the wheel by comparing the wheel rotation speed measured by the encoder with a target wheel rotation speed.

According to another embodiment, whether the Segway wheel spins may be determined by comparing the straight-line travel distance of the Segway with the rotation distance of the wheel, and spin may be determined to exist when the rotation distance of the wheel is greater than the straight-line travel distance.
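As a non-limiting illustration, the distance-based spin check can be sketched in Python as follows. The function names and the small `tolerance` margin, which keeps measurement noise from being reported as spin, are assumptions of this sketch; the specification only states that spin exists when the wheel rotation distance exceeds the straight-line travel distance.

```python
import math


def wheel_rotation_distance(rotation_angle_rad, wheel_radius):
    """Arc length rolled out by the wheel for a given encoder angle."""
    return rotation_angle_rad * wheel_radius


def wheel_is_spinning(rotation_distance, travel_distance, tolerance=0.01):
    """Spin is reported when the wheel has turned farther than the body has moved.

    tolerance (in the same length unit) is a hypothetical noise margin.
    """
    return rotation_distance > travel_distance + tolerance


# Example: one full wheel turn (radius 0.1 m) while the body moved only 0.5 m.
rolled = wheel_rotation_distance(2 * math.pi, 0.1)
spinning = wheel_is_spinning(rolled, 0.5)
```

In a flow such as FIG. 8, the result of this check would gate whether the parameter calculation and control logic steps are performed.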

The step of determining whether a disturbance occurs while the Segway is driving (S303) may be a step of determining whether a disturbance has occurred by monitoring changes in data such as the mass (m) or the rotational inertia (I) during driving. In this process, a preset control algorithm may be executed, or a change exceeding a reference value may be determined to be a disturbance.

In addition, when determining whether a disturbance has occurred, the weight of the rider may be excluded and the disturbance may be judged based on the weight of the Segway itself.
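As a non-limiting illustration, the threshold-based disturbance check can be sketched in Python as follows. The function names and the default reference values are hypothetical; the specification only states that a change in mass (m) or rotational inertia (I) exceeding a reference value may be treated as a disturbance.

```python
def exceeds_reference(previous, current, reference):
    """True when the estimate changed by more than the reference value."""
    return abs(current - previous) > reference


def disturbance_detected(prev_mass, mass, prev_inertia, inertia,
                         mass_reference=0.5, inertia_reference=0.05):
    """Report a disturbance when either m or I changed beyond its reference value.

    The default reference values are placeholders chosen for this sketch.
    """
    return (exceeds_reference(prev_mass, mass, mass_reference)
            or exceeds_reference(prev_inertia, inertia, inertia_reference))
```

The mass values passed in would be those of the Segway itself, with the rider's weight excluded, in line with the paragraph above.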

The step of performing the Segway control logic (S304) may be a step of performing data learning, when no wheel spin occurs, by applying control logic that calculates a Segway state value and compares it with a target Segway state value.

If necessary, a step of dynamically modeling the Segway and extracting the parameters required for the model may be performed before the step of performing the Segway control logic. Because the computational degrees of freedom increase with the number of parameters used in the learning algorithm, the number of parameters may be limited to a preset range.

The control logic may compare the target Segway state value, which corresponds to the target entered by the user, with the current Segway state value, and may calculate the parameters in a set order.

The control logic may repeatedly update the parameters to improve data accuracy, and may repeat learning until a preset target deviation or target ratio is reached.

In addition, when it is determined that a difference exists between the Segway state value and the target Segway state value, the control logic may perform reinforcement learning by iteratively recalculating the Segway state value so that it converges to the target Segway state value. For example, a cyclic iteration may be performed in which all or part of the obtained output values are reused as input data.
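As a non-limiting illustration, the cyclic iteration can be sketched in Python as a loop that feeds each output back in as the next input until the state value is within a target deviation. This is a plain fixed-point style loop rather than a reinforcement learning algorithm; the function name, the `update` callback, the tolerance, and the iteration limit are all assumptions of this sketch.

```python
def iterate_to_target(state, target, update, tolerance=1e-3, max_iterations=100):
    """Repeat the update until |target - state| <= tolerance.

    update(state, target) is a hypothetical stand-in for one learning step;
    its output becomes the input of the next iteration.
    Returns the final state and the number of updates performed.
    """
    for iteration in range(max_iterations):
        if abs(target - state) <= tolerance:
            return state, iteration
        state = update(state, target)
    return state, max_iterations


# Example update rule (placeholder): close half of the remaining error each time.
final_state, updates = iterate_to_target(
    0.0, 1.0, lambda s, t: s + 0.5 * (t - s)
)
```

The tolerance plays the role of the preset target deviation, and the iteration limit bounds the work when the update rule fails to converge.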

The step of generating a motor control signal of the Segway to control the driving of the motor (S305) may be a step of controlling the driving of the motors simultaneously or at different times according to the Segway motor control signal. The driving of each motor may be controlled independently according to the motion state of the Segway, for example linear motion or rotational motion.

Through the method described above, a self-balancing function may be performed that keeps parameters such as direction and speed constant during operation of the Segway, and an autonomous driving function may be implemented that controls the operation of the Segway by updating the parameters autonomously without continuous manipulation by the rider.

The Segway driving state control method 300 according to the technical idea of the present invention is not limited to the order and steps of FIG. 8; some of the steps may be omitted, and the order of the steps may be changed.

Claims

A reinforcement learning-based driving state control system for a Segway-type robot, comprising:
a driving data collection unit that collects driving data including inertial data measured by an IMU sensor of the robot and motor rotation data measured by an encoder of the robot;
a control logic operation unit that performs reinforcement learning by calculating a robot state value from the driving data obtained from the driving data collection unit and comparing it with a target robot state value; and
a robot driving control unit that generates a robot control signal for controlling driving conditions of the robot based on the robot state value obtained from the control logic operation unit.

The robot driving state control system according to claim 1,
wherein the robot is a Segway, the inertial data includes angular velocity and acceleration values for the x-axis, y-axis, and z-axis of the robot, and the rotation data includes a rotation angle value for each wheel of the robot, and
wherein the posture of the robot is recognized based on the inertial data and wheel movement is recognized based on the rotation data.

The robot driving state control system according to claim 2,
wherein the control logic operation unit takes the linear motion acceleration value (a) of the robot as an input value and calculates the mass value (m) and the rotational inertia value (I) of the robot to obtain the robot state value.

The robot driving state control system according to claim 1,
wherein the reinforcement learning calculates the difference between the acquired robot state value and the preset target robot state value, ends learning when the difference is recognized as 0 or as being within a reference range, and updates the robot state value when the difference is outside the reference range.

The robot driving state control system according to claim 1,
wherein the reinforcement learning is performed by a simulation program or by receiving a robot state estimate value transmitted from a user terminal.

The robot driving state control system according to claim 1,
wherein the robot state value includes one or more of the mass, rotational inertia, tilt, amount of change in tilt, speed, acceleration, wheel torque, and wheel angular velocity of the robot.

The robot driving state control system according to claim 1,
wherein the control logic operation unit estimates the robot state value by obtaining linear motion speed data from the sum of the speeds of the left and right wheels of the robot and obtaining rotational motion speed data from the difference between the speeds of the left and right wheels of the robot.

The robot driving state control system according to claim 1,
further comprising a robot driving control unit that controls the operation of the motor by transmitting, to a motor driver, a robot control signal corresponding to the robot state value calculated by the control logic operation unit.

A method of determining and controlling the driving state of a Segway, the method comprising:
collecting driving data of the Segway;
determining whether a wheel spins based on the driving data;
performing control logic that calculates a Segway state value and compares it with a target Segway state value when the wheel does not spin; and
controlling the driving of a motor so as to correspond to the Segway state value.

The Segway driving state control method according to claim 9,
wherein the Segway includes sensors that collect the driving data, and
wherein the driving data includes inertial data including angular velocity and acceleration values of the Segway, and rotation data including a rotation angle value for each wheel of the Segway.

The Segway driving state control method according to claim 9,
wherein the Segway further includes an encoder that measures the rotation angle displacement of the wheel, and
a processor that determines the rotation of the wheel by comparing the wheel rotation speed measured by the encoder with a target wheel rotation speed.

The Segway driving state control method according to claim 9,
wherein whether the wheel spins is determined by comparing the straight-line travel distance of the Segway with the rotation distance of the wheel, and spin is determined to exist when the rotation distance of the wheel is greater than the straight-line travel distance.

The Segway driving state control method according to claim 9,
wherein the Segway further includes a hub motor that provides driving force to the wheel, and
a processor that obtains acceleration data by applying output data of the hub motor to a preset algorithm.

The Segway driving state control method according to claim 9,
further comprising performing a control algorithm that determines whether a disturbance has occurred based on a change in mass (m) or rotational inertia (I) occurring while the Segway is driving.

The Segway driving state control method according to claim 9,
wherein, when a difference is determined to exist between the Segway state value and the target Segway state value, reinforcement learning is performed by iteratively recalculating the Segway state value so that it converges to the target Segway state value.

The Segway driving state control method according to claim 9,
wherein the driving of the wheels is controlled individually by transmitting a motor control signal corresponding to the Segway state value to a motor driver.