Modeling And Simulation Of Distributed Computing Frameworks
| Title: | Modeling And Simulation Of Distributed Computing Frameworks |
|---|---|
| Document Number: | 20180025103 |
| Publication Date: | January 25, 2018 |
| Appl. No: | 15/654310 |
| Application Filed: | July 19, 2017 |
| Abstract: | A method receives a second data set that is different from a first data set. An operation estimator generates a total number of operations based on the second data set, and a resource cost estimator generates an aggregate resource cost for that total number of operations. The method generates a simulation driver file including a sequence of operations from the total number of operations and a resource cost for each operation in the sequence of operations from the aggregate resource cost. The method simulates the sequence of operations by performing: requesting an amount of resource used by a respective operation on the simulated distributed computing system; reserving the amount of resource when available in the simulated distributed computing system without executing the respective operation; and calculating a time period associated with a simulated execution time of the respective operation. |
| Claim: | 1. A method comprising: receiving, by a computing device, a second data set that is different from a first data set; generating, by the computing device, a total number of operations based on the second data set using an operation estimator, wherein the operation estimator was trained using information from a hardware distributed computing system that executed an algorithm of an application processing the first data set on a set of hardware devices; generating, by the computing device, an aggregate resource cost for the total number of operations based on the second data set using a resource cost estimator that was trained using the information from the hardware distributed computing system; generating, by the computing device, a simulation driver file including a sequence of operations from the total number of operations and a resource cost for each operation in the sequence of operations from the aggregate resource cost based on logic associated with executing the algorithm of the application; and simulating, by the computing device, the sequence of operations by performing: requesting, by the computing device, an amount of resource used by a respective operation on a simulated distributed computing system, the amount of the resource based on the resource cost of the respective operation; reserving, by the computing device, the amount of resource when available in the simulated distributed computing system without executing the respective operation on the simulated distributed computing system; and calculating, by the computing device, a time period associated with a simulated execution time of the respective operation. |
| Claim: | 2. The method of claim 1, wherein simulating comprises: sending the respective operation to a worker process in a plurality of worker processes, wherein the worker process sends the request for the amount of the resource. |
| Claim: | 3. The method of claim 2, wherein the plurality of worker processes are arranged in a distributed structure based on the logic associated with executing the algorithm. |
| Claim: | 4. The method of claim 2, wherein: the plurality of worker processes are processing multiple portions of an operation or multiple operations in parallel and requesting a respective amount of resources for the multiple portions of the operation or the multiple operations. |
| Claim: | 5. The method of claim 2, wherein: a resource manager checks whether the amount of resource is available in the simulated distributed computing system without having to execute the respective operation. |
| Claim: | 6. The method of claim 1, wherein simulating comprises: maintaining one or more resource models in the simulated distributed computing system, wherein the one or more resource models are used to determine if the amount of resource used by the respective operation is available for the respective operation. |
| Claim: | 7. The method of claim 1, wherein simulating comprises: determining if the amount of resource is available in the simulated distributed computing system; when the amount of resource is available, reserving the amount of resource in the simulated distributed computing system for the time period; and when the amount of resource is not available, queuing the respective operation until the amount of resource is available. |
| Claim: | 8. The method of claim 1, wherein calculating the time period comprises: using the simulated execution time and any time spent queuing as the time period for executing the respective operation. |
| Claim: | 9. The method of claim 1, further comprising: receiving the information from the hardware distributed computing system that executed the algorithm of the application processing the first data set on the set of hardware devices; and training the operation estimator to estimate the number of operations based on the received information; and training the resource cost estimator to estimate the resource cost for the operations based on the received information. |
| Claim: | 10. The method of claim 1, wherein generating the simulation driver file comprises: generating the sequence of operations from the total number of operations; and distributing the aggregate resource cost across the sequence of operations based on a resource distribution model. |
| Claim: | 11. The method of claim 10, wherein the resource distribution model is based on the information from the hardware distributed computing system. |
| Claim: | 12. The method of claim 1, wherein generating the sequence of operations comprises: generating multiple types of operations in the sequence of operations based on the total number of operations using an execution framework used to execute the algorithm of the application. |
| Claim: | 13. The method of claim 12, wherein the execution framework executes different types of operations in a set sequence across multiple distributed computing nodes in the simulated distributed computing system. |
| Claim: | 14. The method of claim 1, wherein the simulated distributed computing system includes a different amount of resources than the hardware distributed computing system. |
| Claim: | 15. The method of claim 1, wherein the simulated distributed computing system includes a different configuration of resources than the hardware distributed computing system. |
| Claim: | 16. The method of claim 1, wherein the simulated distributed computing system includes a number of computing node resources, an amount of memory resources, an amount of storage resources, and an amount of network resources. |
| Claim: | 17. A non-transitory computer-readable storage medium containing instructions, that when executed, control a computer system to be configured for: receiving a second data set that is different from a first data set; generating a total number of operations based on the second data set using an operation estimator, wherein the operation estimator was trained using information from a hardware distributed computing system that executed an algorithm of an application processing the first data set on a set of hardware devices; generating an aggregate resource cost for the total number of operations based on the second data set using a resource cost estimator that was trained using the information from the hardware distributed computing system; generating a simulation driver file including a sequence of operations from the total number of operations and a resource cost for each operation in the sequence of operations from the aggregate resource cost based on logic associated with executing the algorithm of the application; and simulating the sequence of operations by performing: requesting an amount of resource used by a respective operation on a simulated distributed computing system, the amount of the resource based on the resource cost of the respective operation; reserving the amount of resource when available in the simulated distributed computing system without executing the respective operation on the simulated distributed computing system; and calculating a time period associated with a simulated execution time of the respective operation. |
| Claim: | 18. A method comprising: receiving, by a computing device, information from a hardware distributed computing system that is executing an algorithm of an application processing a first data set on a set of hardware devices; training, by the computing device, an operation estimator to estimate a number of operations and a resource cost estimator to estimate a resource cost for the number of operations based on the received information; receiving, by the computing device, a second data set that is different from the first data set; generating, by the computing device, a total number of operations based on the second data set using the operation estimator; generating, by the computing device, an aggregate resource cost for the total number of operations based on the second data set using the resource cost estimator; generating, by the computing device, a simulation driver file including a sequence of operations from the total number of operations and a resource cost for each operation in the sequence of operations from the aggregate resource cost; and simulating, by the computing device, the sequence of operations on a simulated distributed computing system, wherein the simulating uses the resource cost of a respective operation in the sequence of operations to determine an amount of resource used by the respective operation on the simulated distributed computing system without executing the respective operation on the simulated distributed computing system. |
| Claim: | 19. The method of claim 18, wherein simulating comprises: maintaining one or more resource models in the simulated distributed computing system, wherein the one or more resource models are used to determine if the amount of resource used by the respective operation is available for the respective operation. |
| Claim: | 20. The method of claim 18, wherein simulating comprises: determining if the amount of resource is available in the simulated distributed computing system; when the amount of resource is available, reserving the amount of resource in the simulated distributed computing system for the time period; and when the amount of resource is not available, queuing the respective operation until the amount of resource is available. |
| Current International Class: | 06; 06 |
| Accession Number: | edspap.20180025103 |
| Database: | USPTO Patent Applications |
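
The flow described in claims 1, 7, 8, and 10 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `Operation`, `build_driver`, and `simulate` are hypothetical, the aggregate cost is spread evenly (a stand-in for the unspecified resource distribution model), a single resource pool replaces the per-node memory, storage, and network models, and each operation's simulated execution time is assumed to be proportional to its cost.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Operation:
    name: str
    cost: int        # resource units requested (hypothetical unit)
    duration: float  # simulated execution time (assumed proportional to cost)


def build_driver(total_ops, aggregate_cost, time_per_unit=1.0):
    """Build an in-memory 'driver': a sequence of operations with per-op costs.

    Assumption: the aggregate cost is distributed as evenly as possible.
    """
    base, extra = divmod(aggregate_cost, total_ops)
    ops = []
    for i in range(total_ops):
        cost = base + (1 if i < extra else 0)
        ops.append(Operation(f"op{i}", cost, cost * time_per_unit))
    return ops


def simulate(ops, capacity):
    """Reserve resources for each operation in order, without executing it.

    All operations are assumed to be submitted at time 0 and served FIFO.
    Returns {name: (queue_time, time_period)}, where time_period is the
    queue time plus the simulated execution time.
    """
    clock = 0.0
    free = capacity
    running = []  # min-heap of (finish_time, reserved_cost)
    result = {}
    for op in ops:
        if op.cost > capacity:
            raise ValueError(f"{op.name} can never fit in capacity {capacity}")
        # Queue the operation: advance the clock until enough resource is free.
        while free < op.cost:
            finish, released = heapq.heappop(running)
            clock = max(clock, finish)
            free += released
        # Reserve the requested amount for the simulated execution time.
        free -= op.cost
        heapq.heappush(running, (clock + op.duration, op.cost))
        result[op.name] = (clock, clock + op.duration)
    return result


if __name__ == "__main__":
    driver = build_driver(total_ops=4, aggregate_cost=10)
    for name, (queued, period) in simulate(driver, capacity=5).items():
        print(name, "queued for", queued, "time period", period)
```

Because operations are only reserved and timed, never run, the same driver can be replayed against different `capacity` values to compare hypothetical cluster sizes, which is the kind of what-if comparison claims 14 and 15 point to.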