Automated job flow generation to provide object views in container-supported many task computing

Bibliographic Details
Title: Automated job flow generation to provide object views in container-supported many task computing
Patent Number: 11,775,341
Publication Date: October 03, 2023
Appl. No: 17/733196
Application Filed: April 29, 2022
Abstract: An apparatus includes a processor to receive a request to provide a view of an object associated with a job flow, and in response to determining that the object is associated with a task type requiring access to a particular resource not accessible to a first interpretation routine: store, within a job queue, a job flow generation request message to cause generation of a job flow definition that defines another job flow for generating the requested view; within a task container in which a second interpretation routine that does have access to the particular resource is executed, generate the job flow definition; store, within a task queue, a job flow generation completion message that includes a copy of the job flow definition; use the job flow definition to perform the other job flow to generate the requested view; and transmit the requested view to the requesting device.
Inventors: SAS Institute Inc. (Cary, NC, US)
Assignees: SAS Institute Inc. (Cary, NC, US)
Claim: 1. An apparatus comprising at least one processor and a storage to store instructions that, when executed by the at least one processor, cause the at least one processor to perform operations comprising: receive, at the at least one processor, and from a requesting device via a network, a request to provide a view of an object associated with a job flow, wherein: the job flow is defined in a job flow definition that specifies a set of tasks to be performed via execution of a corresponding set of task routines within a set of node devices during a performance of the job flow; analyze the object to determine whether the object is associated with a task type that, during a performance of the job flow, requires access to a particular resource that is not accessible to a first interpretation routine that is executable by the at least one processor to cause generation of views of objects; in response to a determination that the object is associated with the task type, perform operations comprising: store, within a job queue, a first job flow generation request message comprising a request to generate another job flow that causes generation of the requested view of the object when the other job flow is performed, and an indication of the task type; within a first task container, in response to the request to generate the other job flow, the at least one processor is caused to perform operations comprising: analyze the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the first task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does have access to the particular resource, perform operations comprising: execute instructions of the instance of the second interpretation routine to generate another job flow definition for the other job flow that specifies another set of tasks to be performed via execution of corresponding other set of task routines to generate the requested view of the object during a performance of the other job flow; and store, within a task queue, a first job flow generation completion message comprising an indication of completion of the generation of the other job flow definition; and after completion of the performance of the other job flow to generate the requested view of the object, transmit a copy of the requested view of the object to the requesting device via the network; and in response to a determination that the object is not associated with the task type, perform operations comprising: execute instructions of the instance of the first interpretation routine to generate the requested view of the object.
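The front-end branch recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the set `SPECIAL_TASK_TYPES`, the dictionary keys, and the function name `handle_view_request` are all hypothetical, and the task-container side (generating the other job flow definition) is left out.

```python
from collections import deque

# Hypothetical set of task types that need a resource the first
# interpretation routine cannot reach (names are assumptions).
SPECIAL_TASK_TYPES = {"gpu_transform", "restricted_read"}


def handle_view_request(obj, job_queue, first_interpreter):
    """Return a view immediately, or queue a job flow generation request."""
    task_type = obj.get("task_type")
    if task_type not in SPECIAL_TASK_TYPES:
        # Ordinary object: the first interpretation routine renders it directly.
        return first_interpreter(obj)
    # Special object: request another job flow that will produce the view.
    job_queue.append({
        "kind": "job_flow_generation_request",
        "object_id": obj["id"],
        "task_type": task_type,  # the indication of the task type
    })
    return None  # the view is transmitted later, once the other job flow has run


job_queue = deque()
direct = handle_view_request({"id": "report-1"}, job_queue,
                             lambda o: "view of " + o["id"])
queued = handle_view_request({"id": "table-7", "task_type": "gpu_transform"},
                             job_queue, lambda o: "unused")
```

In this sketch `direct` holds a rendered view, while `queued` is `None` and a single request message waits in `job_queue` for a task container to pick up.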
Claim: 2. The apparatus of claim 1, wherein, in response to a determination that the object is not associated with the task type, the at least one processor is caused to perform operations comprising: before commencement of execution of instructions of the first interpretation routine to generate the requested view of the object, store, within the job queue, an object view generation message comprising an indication of the generation of the requested view of the object as underway, and an identifier of the instance of the first interpretation routine to provide an indication to another instance of the first interpretation routine of which instance of the first interpretation routine is currently involved in generating the requested view of the object; and after completion of the generation of the requested view of the object, remove the object view generation message.
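The "underway" message of claim 2 behaves much like an advisory marker that other interpreter instances can read. The following Python sketch assumes a plain list as the job queue and invents the message fields and function names; removing the marker in a `finally` block (so it is also cleared on failure) is a design choice of the sketch, not something the claim states.

```python
def render_with_marker(object_id, instance_id, job_queue, render):
    """Render a view while advertising which instance is doing the work."""
    marker = {
        "kind": "object_view_generation",
        "status": "underway",
        "object_id": object_id,
        "instance_id": instance_id,
    }
    job_queue.append(marker)       # stored before rendering starts
    try:
        return render(object_id)
    finally:
        job_queue.remove(marker)   # removed once generation has finished


def who_is_rendering(object_id, job_queue):
    """Let another instance find out who is generating a given view."""
    for message in job_queue:
        if (message["kind"] == "object_view_generation"
                and message["object_id"] == object_id):
            return message["instance_id"]
    return None
```

A second instance would call `who_is_rendering` before starting its own work and skip the object if an identifier comes back.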
Claim: 3. The apparatus of claim 1, wherein, in response to the determination that the object is associated with the task type, the at least one processor is caused to perform operations comprising: within a performance container, in response to the storage of the first job flow generation request message within the job queue, store a second job flow generation request message within a task queue; within the first task container, perform operations comprising: retrieve an indication of the task type from the second job flow generation request message stored within the task queue to enable the analysis of the task type; and after completion of the generation of the other job flow definition, include a copy of the other job flow definition in the first job flow generation completion message stored within the task queue; within the performance container, in response to the storage of the first job flow generation completion message within the task queue, store a second job flow generation completion message within the job queue; and in response to the storage of the second job flow generation completion message within the job queue, store, within the job queue, a job performance request message comprising a request to perform the other job flow to generate the requested view of the object.
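The message relay in claim 3 moves through two queues in a fixed order. The Python sketch below traces that order under several assumptions: messages are consumed as they are handled, the message `kind` strings and function names are invented, and a single function stores both the second completion message and the job performance request, although the claim does not say which component stores the latter.

```python
from collections import deque


def relay_request(job_queue, task_queue):
    """Performance container: forward the first request as a second request."""
    first = job_queue.popleft()
    task_queue.append({**first, "kind": "generation_request_2"})


def generate_in_task_container(task_queue):
    """Task container: read the task type, build a definition, report completion."""
    request = task_queue.popleft()
    definition = {
        "object_id": request["object_id"],
        "tasks": ["render_view_for_" + request["task_type"]],
    }
    # The first completion message carries a copy of the definition.
    task_queue.append({"kind": "generation_complete_1",
                       "definition": dict(definition)})


def relay_completion(task_queue, job_queue):
    """Performance container: report completion, then request performance."""
    done = task_queue.popleft()
    job_queue.append({"kind": "generation_complete_2",
                      "definition": done["definition"]})
    job_queue.append({"kind": "job_performance_request",
                      "definition": done["definition"]})


job_queue = deque([{"kind": "generation_request_1",
                    "object_id": "table-7", "task_type": "gpu_transform"}])
task_queue = deque()
relay_request(job_queue, task_queue)
generate_in_task_container(task_queue)
relay_completion(task_queue, job_queue)
```

After these three calls the job queue holds the second completion message followed by the job performance request, and the task queue is empty.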
Claim: 4. The apparatus of claim 1, wherein the at least one processor is caused to perform operations comprising: within a second task container, in response to the request to generate the other job flow, the at least one processor is caused to perform operations comprising: analyze the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the second task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does not have access to the particular resource, refrain from performing operations to generate the other job flow definition.
Claim: 5. The apparatus of claim 4, wherein: the first task container is instantiated within a first node device of the set of node devices; the first node device provides access to the particular resource; the second task container is instantiated within a second node device of the set of node devices; and the second node device does not provide access to the particular resource.
Claim: 6. The apparatus of claim 5, wherein: the first task container is instantiated within a first virtual machine (VM) that is instantiated within the first node device; the first VM provides the first task container with access to the particular resource; and the second task container is instantiated within a second VM that is instantiated within the second node device.
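Claims 4 to 6 describe task containers on different node devices, only some of which can reach the needed resource. A minimal Python sketch of that selection follows; the `TaskContainer` class, its fields, and the `first_capable` helper are hypothetical, and resource access is modeled as a simple set supplied by the hosting node or VM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskContainer:
    name: str
    node: str              # node device (or VM) hosting the container
    resources: frozenset   # resources that host makes available

    def try_generate(self, request):
        """Return a job flow definition, or None to refrain from generating."""
        if request["resource"] not in self.resources:
            return None    # no access: this container does nothing
        return {"generated_by": self.name,
                "tasks": ["render:" + request["object_id"]]}


def first_capable(containers, request):
    """Offer the request to each container until one generates a definition."""
    for container in containers:
        definition = container.try_generate(request)
        if definition is not None:
            return definition
    return None


containers = [
    TaskContainer("tc2", "node2", frozenset()),         # lacks the resource
    TaskContainer("tc1", "node1", frozenset({"gpu"})),  # has the resource
]
definition = first_capable(containers, {"object_id": "table-7", "resource": "gpu"})
```

Here `tc2` refrains and `tc1` produces the definition, mirroring the second and first task containers of the claims.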
Claim: 7. The apparatus of claim 1, wherein: the particular resource comprises an alternate processor incorporated into a particular node device of the set of node devices; the object comprises a data set in which data values are organized in a manner that requires use of the alternate processor to perform at least one operation on the data values; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises using the alternate processor to perform the at least one operation on the data values.
Claim: 8. The apparatus of claim 1, wherein: the particular resource comprises a set of restricted data objects to which access is restricted to being provided by a particular node device of the set of node devices; the object comprises a data object of the set of restricted data objects; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises retrieving the object from among the set of restricted objects from within a task container instantiated within the particular node device.
Claim: 9. The apparatus of claim 1, wherein: the particular resource comprises the second interpretation routine; the second interpretation routine is executable by the at least one processor to interpret instructions written in a particular programming language; the first interpretation routine is not able to be executed by the at least one processor to interpret instructions written in the particular programming language; the object comprises a task routine that comprises instructions written in the particular programming language; analyzing the object to determine whether the object is associated with the task type comprises analyzing at least one of executable instructions or comments within the object, or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises interpreting the instructions within the task routine that are written in the particular programming language.
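Claims 7 to 9 name three ways of recognizing the task type: object metadata, a portion of the object identifier, and instructions or comments inside a task routine. The Python sketch below shows one check of each kind. Every concrete rule in it (the `layout` metadata key, the `restricted/` prefix, the `.jl` suffix, and the `# lang:` comment) is an invented placeholder, since the claims do not specify any particular convention.

```python
import re


def detect_task_type(identifier, metadata=None, source_text=None):
    """Guess which special resource, if any, an object requires."""
    metadata = metadata or {}
    # Claim 7 style: metadata signals a layout needing an alternate processor.
    if metadata.get("layout") == "gpu_tiled":
        return "alternate_processor"
    # Claim 8 style: part of the identifier marks a restricted data object.
    if identifier.startswith("restricted/"):
        return "restricted_data"
    # Claim 9 style: identifier suffix or a comment reveals another language.
    if identifier.endswith(".jl"):
        return "other_language"
    if source_text and re.search(r"^#\s*lang:\s*julia", source_text, re.M):
        return "other_language"
    return None  # no special resource: the first interpretation routine suffices
```

A return value of `None` corresponds to the branch in which the first interpretation routine generates the view directly.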
Claim: 10. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, the computer-program product including instructions operable to cause at least one processor to perform operations comprising: receive, at the at least one processor, and from a requesting device via a network, a request to provide a view of an object associated with a job flow, wherein: the job flow is defined in a job flow definition that specifies a set of tasks to be performed via execution of a corresponding set of task routines within a set of node devices during a performance of the job flow; analyze the object to determine whether the object is associated with a task type that, during a performance of the job flow, requires access to a particular resource that is not accessible to a first interpretation routine that is executable by the at least one processor to cause generation of views of objects; in response to a determination that the object is associated with the task type, perform operations comprising: store, within a job queue, a first job flow generation request message comprising a request to generate another job flow that causes generation of the requested view of the object when the other job flow is performed, and an indication of the task type; within a first task container, in response to the request to generate the other job flow, the at least one processor is caused to perform operations comprising: analyze the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the first task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does have access to the particular resource, perform operations comprising: execute instructions of the instance of the second interpretation routine to generate another job flow definition for the other job flow that specifies another set of tasks to be performed via execution of corresponding other set of task routines to generate the requested view of the object during a performance of the other job flow; and store, within a task queue, a first job flow generation completion message comprising an indication of completion of the generation of the other job flow definition; and after completion of the performance of the other job flow to generate the requested view of the object, transmit a copy of the requested view of the object to the requesting device via the network; and in response to a determination that the object is not associated with the task type, perform operations comprising: execute instructions of the instance of the first interpretation routine to generate the requested view of the object.
Claim: 11. The computer-program product of claim 10, wherein, in response to a determination that the object is not associated with the task type, the at least one processor is caused to perform operations comprising: before commencement of execution of instructions of the first interpretation routine to generate the requested view of the object, store, within the job queue, an object view generation message comprising an indication of the generation of the requested view of the object as underway, and an identifier of the instance of the first interpretation routine to provide an indication to another instance of the first interpretation routine of which instance of the first interpretation routine is currently involved in generating the requested view of the object; and after completion of the generation of the requested view of the object, remove the object view generation message.
Claim: 12. The computer-program product of claim 10, wherein, in response to the determination that the object is associated with the task type, the at least one processor is caused to perform operations comprising: within a performance container, in response to the storage of the first job flow generation request message within the job queue, store a second job flow generation request message within a task queue; within the first task container, perform operations comprising: retrieve an indication of the task type from the second job flow generation request message stored within the task queue to enable the analysis of the task type; and after completion of the generation of the other job flow definition, include a copy of the other job flow definition in the first job flow generation completion message stored within the task queue; within the performance container, in response to the storage of the first job flow generation completion message within the task queue, store a second job flow generation completion message within the job queue; and in response to the storage of the second job flow generation completion message within the job queue, store, within the job queue, a job performance request message comprising a request to perform the other job flow to generate the requested view of the object.
Claim: 13. The computer-program product of claim 10, wherein the at least one processor is caused to perform operations comprising: within a second task container, in response to the request to generate the other job flow, the at least one processor is caused to perform operations comprising: analyze the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the second task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does not have access to the particular resource, refrain from performing operations to generate the other job flow definition.
Claim: 14. The computer-program product of claim 13, wherein: the first task container is instantiated within a first node device of the set of node devices; the first node device provides access to the particular resource; the second task container is instantiated within a second node device of the set of node devices; and the second node device does not provide access to the particular resource.
Claim: 15. The computer-program product of claim 14, wherein: the first task container is instantiated within a first virtual machine (VM) that is instantiated within the first node device; the first VM provides the first task container with access to the particular resource; and the second task container is instantiated within a second VM that is instantiated within the second node device.
Claim: 16. The computer-program product of claim 10, wherein: the particular resource comprises an alternate processor incorporated into a particular node device of the set of node devices; the object comprises a data set in which data values are organized in a manner that requires use of the alternate processor to perform at least one operation on the data values; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises using the alternate processor to perform the at least one operation on the data values.
Claim: 17. The computer-program product of claim 10, wherein: the particular resource comprises a set of restricted data objects to which access is restricted to being provided by a particular node device of the set of node devices; the object comprises a data object of the set of restricted data objects; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises retrieving the object from among the set of restricted objects from within a task container instantiated within the particular node device.
Claim: 18. The computer-program product of claim 10, wherein: the particular resource comprises the second interpretation routine; the second interpretation routine is executable by the at least one processor to interpret instructions written in a particular programming language; the first interpretation routine is not able to be executed by the at least one processor to interpret instructions written in the particular programming language; the object comprises a task routine that comprises instructions written in the particular programming language; analyzing the object to determine whether the object is associated with the task type comprises analyzing at least one of executable instructions or comments within the object, or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises interpreting the instructions within the task routine that are written in the particular programming language.
Claim: 19. A computer-implemented method comprising: receiving, by at least one processor, and from a requesting device via a network, a request to provide a view of an object associated with a job flow, wherein: the job flow is defined in a job flow definition that specifies a set of tasks to be performed via execution of a corresponding set of task routines within a set of node devices during a performance of the job flow; analyzing, by the at least one processor, the object to determine whether the object is associated with a task type that, during a performance of the job flow, requires access to a particular resource that is not accessible to a first interpretation routine that is executable by the at least one processor to cause generation of views of objects; in response to a determination that the object is associated with the task type, performing operations comprising: storing, within a job queue, a first job flow generation request message comprising a request to generate another job flow that causes generation of the requested view of the object when the other job flow is performed, and an indication of the task type; within a first task container, in response to the request to generate the other job flow, performing operations comprising: analyzing, by the at least one processor, the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the first task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does have access to the particular resource, performing operations comprising: executing, by the at least one processor, instructions of the instance of the second interpretation routine to generate another job flow definition for the other job flow that specifies another set of tasks to be performed via execution of corresponding other set of task routines to generate the requested view of the object during a performance of the other job flow; and storing, within a task queue, a first job flow generation completion message comprising an indication of completion of the generation of the other job flow definition; and after completion of the performance of the other job flow to generate the requested view of the object, transmitting, from the at least one processor, a copy of the requested view of the object to the requesting device via the network; or in response to a determination that the object is not associated with the task type, performing operations comprising: executing, by the at least one processor, instructions of the instance of the first interpretation routine to generate the requested view of the object.
Claim: 20. The computer-implemented method of claim 19, comprising, in response to a determination that the object is not associated with the task type, performing operations comprising: before commencement of execution of instructions of the first interpretation routine to generate the requested view of the object, storing, within the job queue, an object view generation message comprising an indication of the generation of the requested view of the object as underway, and an identifier of the instance of the first interpretation routine to provide an indication to another instance of the first interpretation routine of which instance of the first interpretation routine is currently involved in generating the requested view of the object; and after completion of the generation of the requested view of the object, removing the object view generation message.
Claim: 21. The computer-implemented method of claim 19, comprising, in response to the determination that the object is associated with the task type, performing operations comprising: within a performance container, in response to the storage of the first job flow generation request message within the job queue, storing a second job flow generation request message within a task queue; within the first task container, performing operations comprising: retrieving, by the at least one processor, an indication of the task type from the second job flow generation request message stored within the task queue to enable the analysis of the task type; and after completion of the generation of the other job flow definition, including, by the at least one processor, a copy of the other job flow definition in the first job flow generation completion message stored within the task queue; within the performance container, in response to the storage of the first job flow generation completion message within the task queue, storing a second job flow generation completion message within the job queue; and in response to the storage of the second job flow generation completion message within the job queue, storing, within the job queue, a job performance request message comprising a request to perform the other job flow to generate the requested view of the object.
Claim: 22. The computer-implemented method of claim 19, comprising: within a second task container, in response to the request to generate the other job flow, performing operations comprising: analyzing, by the at least one processor, the task type to determine whether an instance of a second interpretation routine, that is executable by the at least one processor within the second task container, does have access to the particular resource; and in response to a determination that the instance of the second interpretation routine does not have access to the particular resource, refraining from performing operations to generate the other job flow definition.
Claim: 23. The computer-implemented method of claim 22, wherein: the first task container is instantiated within a first node device of the set of node devices; the first node device provides access to the particular resource; the second task container is instantiated within a second node device of the set of node devices; and the second node device does not provide access to the particular resource.
Claim: 24. The computer-implemented method of claim 23, wherein: the first task container is instantiated within a first virtual machine (VM) that is instantiated within the first node device; the first VM provides the first task container with access to the particular resource; and the second task container is instantiated within a second VM that is instantiated within the second node device.
Claim: 25. The computer-implemented method of claim 19, wherein: the particular resource comprises an alternate processor incorporated into a particular node device of the set of node devices; the object comprises a data set in which data values are organized in a manner that requires use of the alternate processor to perform at least one operation on the data values; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises using the alternate processor to perform the at least one operation on the data values.
Claim: 26. The computer-implemented method of claim 19, wherein: the particular resource comprises a set of restricted data objects to which access is restricted to being provided by a particular node device of the set of node devices; the object comprises a data object of the set of restricted data objects; analyzing the object to determine whether the object is associated with the task type comprises at least one of analyzing metadata associated with the object or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises retrieving the object from among the set of restricted objects from within a task container instantiated within the particular node device.
Claim: 27. The computer-implemented method of claim 19, wherein: the particular resource comprises the second interpretation routine; the second interpretation routine is executable by the at least one processor to interpret instructions written in a particular programming language; the first interpretation routine is not able to be executed by the at least one processor to interpret instructions written in the particular programming language; the object comprises a task routine that comprises instructions written in the particular programming language; analyzing the object to determine whether the object is associated with the task type comprises analyzing at least one of executable instructions or comments within the object, or analyzing a portion of an identifier of the object; and at least one task of the other set of tasks specified in the other job flow definition comprises interpreting the instructions within the task routine that are written in the particular programming language.
Patent References Cited: 7698427 April 2010 Lee
8024405 September 2011 Shukla
8671403 March 2014 Sundarrajan et al.
9313133 April 2016 Yeddanapudi
9454323 September 2016 Dausner
9577972 February 2017 Word
9916135 March 2018 Dube et al.
9946719 April 2018 Bowman
9984004 May 2018 Little
9998418 June 2018 Clark
10042886 August 2018 Saadat-Panah
10169121 January 2019 Vibhor
10185547 January 2019 Sun
10277603 April 2019 Ainscow
10346780 July 2019 Deng
10360053 July 2019 Christensen
10361919 July 2019 Yang
10635642 April 2020 Haggerty
10691501 June 2020 Hussain
10838756 November 2020 Singh et al.
10977081 April 2021 Mandagere et al.
10977111 April 2021 Rungta et al.
11068309 July 2021 Allen
11144363 October 2021 Francis Conde
11171834 November 2021 Bockelmann et al.
11481245 October 2022 Oliver
20020184250 December 2002 Kern
20060029068 February 2006 Frank
20130024872 January 2013 Bobroff et al.
20130232497 September 2013 Jalagam et al.
20130290979 October 2013 Kawano
20130332612 December 2013 Cai
20130347003 December 2013 Whitmore
20140040905 February 2014 Tsunoda
20140067457 March 2014 Nagendra
20150082317 March 2015 You
20150149745 May 2015 Eble
20150205633 July 2015 Kaptur
20170093988 March 2017 Rehaag
20170163647 June 2017 Cernoch
20170255886 September 2017 Schmidt
20180337927 November 2018 Carnahan
20200133728 April 2020 Nataraj
20210182729 June 2021 George
Other References: Fakhfakh et al.; “Towards a Provisioning Algorithm for Dynamic Workflows in the Cloud”; IEEE 2015; (Fakhfakh_2015.pdf; pp. 35-40) (Year: 2015). cited by examiner
Garg et al.; “Adaptive workflow scheduling in grid computing based on dynamic resource availability”; Karabuk University; Engineering Science and Technology, an International Journal, 2015; (Garg_2015.pdf; pp. 256-267) (Year: 2015). cited by examiner
Yang et al.; “A Workflow-based Computational Resource Broker with Information Monitoring in Grids”; GCC'06; IEEE 2006; (Yang_2006.pdf; pp. 1-8) (Year: 2006). cited by examiner
Dornemann et al., “On-Demand Resource Provisioning for BPEL workflows using Amazon's Elastic Compute Cloud”, IEEE 2009, pp. 140-147. cited by applicant
Ramirez et al., “Capacity-Driven Scaling Schedules Derivation for Coordinated Elasticity of Containers and Virtual Machines”; IEEE 2019, pp. 177-186. cited by applicant
Chung et al., “Stratus: cost-aware container scheduling in the public cloud”, Carnegie Mellon University; ACM 2018; pp. 121-134. cited by applicant
Sharma et al., “Containers and Virtual Machines at Scale: A Comparative Study,” In Proceedings of the 17th International Middleware Conference (Middleware '16) Association for Computing Machinery, New York, NY, USA, Article 1, pp. 1-13. cited by applicant
Zhang et al., “A Comparative Study of Containers and Virtual Machines in Big Data Environment,” Accepted by Jul. 5, 2018 IEEE International Conference on Cloud Computing, pp. 8. arXiv: 1807.01842 [cs.DC]. cited by applicant
Yildiz et al.; “Fault-Tolerance in Dataflow-based Scientific Workflow Management”; 2010 IEEE 6th World Congress on Services; (Yildiz_2010.pdf; pp. 336-343) (Year: 2010). cited by applicant
Primary Examiner: Patel, Hiren P
Attorney, Agent or Firm: KDW FIRM PLLC
Document Code: edspgr.11775341
Database: USPTO Patent Grants