Machine learning device, robot control device and robot vision system using machine learning device, and machine learning method

Bibliographic Details
Title: Machine learning device, robot control device and robot vision system using machine learning device, and machine learning method
Patent Number: 11,253,999
Publication Date: February 22, 2022
Appl. No: 16/361,207
Application Filed: March 22, 2019
Abstract: A machine learning device includes a state observation unit for observing, as state variables, an image of a workpiece captured by a vision sensor, and a movement amount of an arm end portion from an arbitrary position, the movement amount being calculated so as to bring the image close to a target image; a determination data retrieval unit for retrieving the target image as determination data; and a learning unit for learning the movement amount to move the arm end portion or the workpiece from the arbitrary position to a target position. The target position is a position in which the vision sensor and the workpiece have a predetermined relative positional relationship. The target image is an image of the workpiece captured by the vision sensor when the arm end portion or the workpiece is disposed in the target position.
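The abstract describes an iterative positioning loop: a learned model predicts a movement amount from the current camera image and a stored target image, the arm end portion is moved by that amount, and the process repeats until the movement amount falls to a threshold or below. The following is a minimal sketch of that loop under stated assumptions; the function names, the threshold value, and the 3-component movement amount are illustrative placeholders, not APIs or parameters defined by the patent.

```python
# Minimal sketch (assumptions, not the patent's implementation) of the
# positioning loop the abstract describes. All helper names are hypothetical.
import numpy as np

THRESHOLD = 0.5  # assumed convergence threshold on the movement amount (e.g. mm)

def positioning_loop(capture_image, predict_movement, move_arm_end, target_image):
    """Drive the arm end portion toward the target position.

    capture_image()            -> current image from the vision sensor
    predict_movement(img, tgt) -> movement amount (dx, dy, dz) output by the learned model
    move_arm_end(delta)        -> command the robot controller to move by delta
    """
    while True:
        image = capture_image()
        delta = np.asarray(predict_movement(image, target_image), dtype=float)
        if np.linalg.norm(delta) <= THRESHOLD:
            # Movement amount is at or below the threshold: the arm end portion
            # is treated as having reached the target position, so stop iterating.
            break
        move_arm_end(delta)
```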
Inventors: FANUC CORPORATION (Yamanashi, JP)
Assignees: FANUC CORPORATION (Yamanashi, JP)
Claim: 1. A machine learning device comprising: a state observation unit for observing, as state variables, an image of a workpiece captured by a vision sensor in an arbitrary position, and a movement amount of an arm end portion of a robot from the arbitrary position, the movement amount being calculated so as to bring the image close to a target image; a determination data retrieval unit for retrieving the target image as determination data; a learning unit for learning the movement amount of the arm end portion to move the arm end portion or the workpiece from the arbitrary position to a target position in accordance with a training data set that is constituted of a combination of the state variables and the determination data, the learning unit including a reward calculation unit for calculating a reward based on (i) a position of the arm end portion of the robot or the workpiece after each successive movement by the movement amount and (ii) the target image, and a function update unit for updating a function to predict the movement amount of the arm end portion from present state variables based on the reward, wherein the target position is a position in which the vision sensor and the workpiece have a predetermined positional relationship, and the target image is an image of the workpiece captured by the vision sensor when the arm end portion or the workpiece is disposed in the target position; and a decision determination unit for determining an operation command for the robot, based on a result that the learning unit has performed learning in accordance with the training data set, wherein the decision determination unit repeats calculation until the movement amount becomes a predetermined threshold value or less, after the arm end portion has been moved by the movement amount outputted from the machine learning device, and wherein the machine learning device is installed in a cloud server.
Claim: 2. The machine learning device according to claim 1, wherein the learning unit is configured to learn the movement amount in accordance with the training data set obtained on a plurality of robots.
Claim: 3. The machine learning device according to claim 1, wherein the learning unit updates an action value table corresponding to the movement amount of the arm end portion, based on the state variables and the reward.
Claim: 4. The machine learning device according to claim 1, wherein the learning unit updates an action value table corresponding to a movement amount of an arm end portion of another robot identical to the robot, based on state variables and a reward of the other robot.
Claim: 5. The machine learning device according to claim 1, wherein supervised learning is performed with the use of the image of the workpiece captured by the vision sensor disposed in a predetermined position and a data group of a movement amount of the arm end portion from the predetermined position to the target position, as labels.
Claim: 6. The machine learning device according to claim 1, wherein the learning unit is configured to relearn and update the movement amount of the arm end portion of the robot in accordance with an additional training data set that is constituted of a combination of present state variables and the determination data.
Claim: 7. The machine learning device according to claim 1, further comprising: a target position memory for storing the target position, wherein while the movement amount is repeatedly calculated, the machine learning device learns the movement amount to move the arm end portion or the workpiece from the arbitrary position to the target position stored in the target position memory.
Claim: 8. A robot control device comprising: the machine learning device according to claim 1; a target image memory for storing the target image; and a robot controller for controlling the robot in accordance with the determined operation command.
Claim: 9. A robot vision system comprising: the robot control device according to claim 8; the robot for performing operation on the workpiece using a tool attached to the arm end portion; and the vision sensor attached to the arm end portion of the robot, for imaging the workpiece.
Claim: 10. The robot vision system according to claim 9, wherein after the robot has moved to a position by a movement amount obtained by learning of the machine learning device, the robot performs predetermined operation in the position as a starting point.
Claim: 11. The robot vision system according to claim 9, wherein the machine learning device is connected to the robot control device through a network, and the state observation unit transfers a movement amount calculated by the machine learning device to the robot control device through the network.
Claim: 12. A robot vision system comprising: the robot control device according to claim 8; the robot for performing operation on the workpiece using a tool attached to the arm end portion; and the vision sensor fixed on the outside of the robot, for imaging the workpiece.
Claim: 13. The robot vision system according to claim 1, wherein the reward calculation unit is configured for calculating a reward based on (i) a position of the arm end portion of the robot or the workpiece after a final movement by the movement amount and (ii) the target image.
Claim: 14. A machine learning method, comprising: storing a position in which a vision sensor and a workpiece have a predetermined positional relationship, as a target position; storing an image of the workpiece captured by a vision sensor when an arm end portion of a robot or the workpiece is disposed in the target position, as a target image; observing, as state variables, an image of the workpiece captured in an arbitrary position, and a movement amount of the arm end portion from the arbitrary position, the movement amount being calculated so as to bring the image close to the target image; retrieving the target image from target image memory storing the target image, as determination data; learning the movement amount to move the arm end portion or the workpiece from the arbitrary position to the target position in accordance with a training data set that is constituted of a combination of the state variables and the determination data, wherein said learning comprises: calculating a reward based on (i) a position of the arm end portion of the robot or the workpiece after each successive movement by the movement amount and (ii) the target image, and updating a function to predict the movement amount of the arm end portion from present state variables based on the reward; determining an operation command for the robot, based on a result of the learning in accordance with the training data set; and repeating calculation until the movement amount becomes a predetermined threshold value or less, after moving the arm end portion by the movement amount, wherein the method is performed by a machine learning device installed in a cloud server.
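Claims 1, 3, and 14 recite a reward calculated after each movement from the resulting position and the target image, and a function (or action value table) updated from that reward. The sketch below illustrates one way such a reward and tabular update could look; the image-difference reward, the tabular Q-learning style rule, and the learning-rate and discount constants are assumptions for illustration, since the claims do not fix a specific reward formula or update rule. The supervised-learning variant of claim 5 would instead fit a regressor from captured images to labeled movement amounts.

```python
# Illustrative sketch (assumptions, not the patent's implementation) of a reward
# calculation and action value table update in the spirit of claims 1, 3 and 14.
from collections import defaultdict
import numpy as np

ALPHA, GAMMA = 0.1, 0.9       # assumed learning rate and discount factor
q_table = defaultdict(float)  # action value table: (state, action) -> value

def reward_from_images(image, target_image):
    # Assumed reward: larger (closer to zero) when the captured image is
    # closer to the target image, here via negative mean squared difference.
    diff = np.asarray(image, float) - np.asarray(target_image, float)
    return -float(np.mean(diff ** 2))

def update_action_value(state, action, reward, next_state, actions):
    # Tabular Q-learning style update used as a stand-in for the claimed
    # "function update unit"; state and action are assumed to be discretized
    # (hashable) representations of the observed image and movement amount.
    best_next = max(q_table[(next_state, a)] for a in actions)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next - q_table[(state, action)])
```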
Patent References Cited: 5727132 March 1998 Arimatsu
8256480 September 2012 Weber
9008840 April 2015 Ponulak
2003/0078694 April 2003 Watanabe
2012/0259462 October 2012 Aoba
2012/0296471 November 2012 Inaba
2013/0151007 June 2013 Valpola
2013/0345718 December 2013 Crawford
2016/0184997 June 2016 Uchiyama
2017/0252922 September 2017 Levine
2017/0285584 October 2017 Nakagawa
2018/0126553 May 2018 Corkum
2018/0304467 October 2018 Matsuura
2018/0345496 December 2018 Li
2019/0137954 May 2019 De Magistris
2019/0217476 July 2019 Jiang
2019/0232488 August 2019 Levine
2019/0270200 September 2019 Sakai
2019/0299405 October 2019 Warashina
2020/0101613 April 2020 Yamada
2020/0147804 May 2020 Sugiyama
H4-352201 December 1992
H8-212329 August 1996
H9-76185 March 1997
2003-211381 July 2003
2003-305676 October 2003
2009-83095 April 2009
2010-188432 September 2010
2018-43338 March 2018
2019150911 September 2019
Other References: JP-2019150911 Translation (Year: 2018). cited by examiner
Assistant Examiner: Johnson, Kyle T
Primary Examiner: Burke, Jeff A
Attorney, Agent or Firm: Hauptman Ham, LLP
Accession Number: edspgr.11253999
Database: USPTO Patent Grants