System and Method for Operating Distributed Computer Systems

Detailed Bibliography
Title: System and Method for Operating Distributed Computer Systems
Document Number: 20250103446
Publication Date: March 27, 2025
Appl. No: 18/474341
Application Filed: September 26, 2023
Abstract: A system accesses data of a failed interaction with a target system from a queue and determines whether the failed interaction is a data failure or a system failure. For a data failure, the system determines a category and whether it can be fixed. If it can be fixed, the system updates the data and reprocesses the failed interaction based on the updated data. If it cannot be fixed, the system deletes the data from the queue and notifies the target system of the category of the data failure. For a system failure, the system identifies a system trend of the target system and determines whether it can be fixed. If it can be fixed, the system determines a reprocessing schedule and reprocesses the failed interaction accordingly. If it cannot be fixed, the system deletes the data from the queue and notifies the target system of the system trend.
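The branching described in the abstract can be sketched as a small dispatcher. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `FailedInteraction`, `Triage`, and every callable field are hypothetical stand-ins for the decisions the abstract attributes to the system, and the record does not say what happens to a queue entry after a successful reprocessing, so the sketch simply leaves it in place.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class FailedInteraction:
    target: str    # name of the target system (hypothetical field)
    payload: dict  # data of the failed interaction (hypothetical field)


@dataclass
class Triage:
    # Each callable is a hypothetical stand-in for one decision in the abstract.
    is_data_failure: Callable[[FailedInteraction], bool]
    categorize: Callable[[FailedInteraction], str]
    data_fixable: Callable[[str], bool]
    fix_data: Callable[[FailedInteraction], FailedInteraction]
    system_trend: Callable[[str], str]
    system_fixable: Callable[[str], bool]
    actions: List[str] = field(default_factory=list)

    def handle(self, item: FailedInteraction, queue: Deque[FailedInteraction]) -> None:
        if self.is_data_failure(item):
            category = self.categorize(item)
            if self.data_fixable(category):
                # Fixable data failure: update the data, then reprocess.
                queue[queue.index(item)] = self.fix_data(item)
                self.actions.append(f"reprocess {item.target} with updated data")
            else:
                # Unfixable data failure: delete from the queue and notify.
                queue.remove(item)
                self.actions.append(f"notify {item.target}: category={category}")
        else:
            trend = self.system_trend(item.target)
            if self.system_fixable(trend):
                # Fixable system failure: reprocess later, on a schedule.
                self.actions.append(f"reprocess {item.target} on schedule")
            else:
                # Unfixable system failure: delete from the queue and notify.
                queue.remove(item)
                self.actions.append(f"notify {item.target}: trend={trend}")
```

Passing the decisions in as callables keeps the control flow separate from whatever models would actually make them.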
Claim: 1. A system, comprising:
  a memory operable to store:
    historical failure data associated with historical failed interactions between a plurality of source systems and a plurality of target systems; and
    a queue configured to store data associated with a plurality of failed interactions with one or more target systems during operation of distributed computer systems; and
  a processor operably coupled to the memory and configured to:
    access, from the queue, a first data associated with a first failed interaction with a first target system;
    determine, based on an analysis of the accessed first data and the historical failure data by one or more first machine-learning models, whether the first failed interaction is associated with a data failure or a system failure, wherein the analysis comprises a comparison between the accessed first data and the historical failure data by the one or more first machine-learning models to generate an output comprising a probability indicating the data failure or the system failure; and
    based on determining whether the first failed interaction is associated with a data failure or a system failure:
      if the first failed interaction is associated with a data failure:
        determine by one or more second machine-learning models a category of the data failure, wherein the determining comprises analyzing the accessed first data by the one or more second machine-learning models to output a probability of the category of the data failure among a plurality of categories;
        determine by the one or more second machine-learning models, based on the category, whether the data failure can be fixed by updating the first data, wherein the determining comprises comparing the category of the data failure to a plurality of fixable categories; and
        based on determining whether the data failure can be fixed:
          if the data failure can be fixed:
            update the first data by the one or more second machine-learning models; and
            reprocess the first failed interaction based on the updated first data;
          if the data failure cannot be fixed:
            delete the first data from the queue; and
            transmit a notification comprising the category of the data failure to the first target system; and
      if the first failed interaction is associated with a system failure:
        identify by the one or more second machine-learning models a system trend associated with the first target system by analyzing historical interaction data and success rate for interaction processing associated with the first target system;
        determine, based on the system trend, whether the system failure can be fixed, wherein the determining comprises analyzing the system trend by the one or more second machine-learning models to predict a confidence level for successfully reprocessing the first failed interaction; and
        based on determining whether the system failure can be fixed:
          if the system failure can be fixed:
            determine by the one or more second machine-learning models a reprocessing schedule to reprocess the first failed interaction; and
            reprocess the first failed interaction according to the reprocessing schedule; and
          if the system failure cannot be fixed:
            delete the first data from the queue; and
            transmit a notification comprising the system trend to the first target system.
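Two of the determinations in claim 1 reduce to simple comparisons once the models have produced their outputs: comparing the most probable category against a set of fixable categories, and turning a success rate into a confidence level for reprocessing. The sketch below assumes the model outputs are already available as a probability dictionary and a list of past outcomes. The category names, the `FIXABLE_CATEGORIES` set, the use of the raw success rate as the confidence level, and the 0.5 threshold are all assumptions made for illustration; the claim does not specify them.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical set of categories that can be repaired by updating the data.
FIXABLE_CATEGORIES: Set[str] = {"missing_field", "bad_format"}


def data_failure_fixable(
    category_probabilities: Dict[str, float],
    fixable: Set[str] = FIXABLE_CATEGORIES,
) -> Tuple[str, bool]:
    """Pick the most probable category and compare it to the fixable ones."""
    category = max(category_probabilities, key=category_probabilities.get)
    return category, category in fixable


def retry_confidence(outcomes: List[bool]) -> float:
    """Success rate of past interactions, used here as a stand-in confidence."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def system_failure_fixable(outcomes: List[bool], threshold: float = 0.5) -> bool:
    """Treat the failure as fixable when the confidence reaches the threshold."""
    return retry_confidence(outcomes) >= threshold
```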
Claim: 2. The system of claim 1, wherein the processor is further configured to encrypt the data associated with the plurality of failed interactions with the one or more target systems.
Claim: 3. The system of claim 1, wherein the first data associated with the first failed interaction with the first target system is encrypted, and wherein the processor is further configured to decrypt the first data associated with the first failed interaction with the first target system.
Claim: 4. The system of claim 1, wherein the system trend comprises an availability of the first target system, and wherein the reprocessing schedule is based on the availability of the first target system.
Claim: 5. The system of claim 4, wherein the processor is further configured to: generate, based on the one or more second machine-learning models, an availability query; transmit, to the first target system, the availability query; and receive, from the first target system, the availability of the first target system.
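Claims 4 and 5 tie the reprocessing schedule to the availability reported by the target system. One way to picture this, assuming the target answers the availability query with a list of time windows, is to pick the earliest moment that falls inside a window. The window format, the `margin` parameter, and the function name `next_reprocess_time` are hypothetical; the claims only say the schedule is based on the availability.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Hypothetical availability format: (window start, window end).
Window = Tuple[datetime, datetime]


def next_reprocess_time(
    now: datetime,
    windows: List[Window],
    margin: timedelta = timedelta(minutes=5),
) -> Optional[datetime]:
    """Return the earliest time inside a reported availability window.

    A small margin after each window start gives the target time to settle.
    Returns None when no window leaves room for a retry.
    """
    for start, end in sorted(windows):
        candidate = max(now, start + margin)
        if candidate < end:
            return candidate
    return None
```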
Claim: 6. The system of claim 1, wherein the processor is further configured to generate the one or more first machine-learning models based on contextual and behavioral signals collected over a period of time from a plurality of applications.
Claim: 7. The system of claim 1, wherein updating the first data by the one or more second machine-learning models comprises one or more of correcting the first data, supplementing the first data, or replacing the first data.
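The three update modes named in claim 7 (correcting, supplementing, replacing) can be illustrated on a dictionary payload. The payload shape and the helper names are assumptions for this sketch; in the claim, the update is performed by the second machine-learning models, which are not modeled here.

```python
from typing import Any, Callable, Dict

Payload = Dict[str, Any]


def correct(data: Payload, key: str, fixer: Callable[[Any], Any]) -> Payload:
    """Correct one existing value, leaving the original payload untouched."""
    updated = dict(data)
    updated[key] = fixer(updated[key])
    return updated


def supplement(data: Payload, extra: Payload) -> Payload:
    """Add missing keys from `extra` without overwriting existing values."""
    return {**extra, **data}


def replace(data: Payload, replacement: Payload) -> Payload:
    """Discard the old payload and use a copy of the replacement."""
    return dict(replacement)
```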
Claim: 8. The system of claim 1, wherein the processor is further configured to determine the first data comprises no sensitive data prior to updating the first data if the data failure can be fixed.
Claim: 9. The system of claim 1, wherein the processor is further configured to: transmit, to the one or more target systems, a plurality of queries for status associated with a plurality of interactions; receive, from the one or more target systems, a plurality of HTTP response codes associated with the plurality of interactions; and determine, based on the plurality of HTTP response codes, one or more of the plurality of interactions failed.
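Claim 9 determines failed interactions from HTTP response codes. A minimal sketch, assuming each interaction has an identifier and that any 4xx or 5xx status counts as a failure (the claim does not state which codes qualify):

```python
from http import HTTPStatus
from typing import Dict, List


def failed_interactions(status_by_id: Dict[str, int]) -> List[str]:
    """Return the ids whose HTTP status is a client or server error."""
    return [
        interaction_id
        for interaction_id, code in status_by_id.items()
        if code >= HTTPStatus.BAD_REQUEST  # 400 and above
    ]
```

The identifiers returned here would be the entries placed on the queue of failed interactions.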
Claim: 10. A method comprising, by one or more computing systems:
  accessing, from a queue, a first data associated with a first failed interaction with a first target system, wherein the queue is configured to store data associated with a plurality of failed interactions with one or more target systems during operation of distributed computer systems;
  determining, based on an analysis of the accessed first data and historical failure data by one or more first machine-learning models, whether the first failed interaction is associated with a data failure or a system failure, wherein the analysis comprises a comparison between the accessed first data and the historical failure data by the one or more first machine-learning models to generate an output comprising a probability indicating the data failure or the system failure, and wherein the historical failure data is associated with historical failed interactions between a plurality of source systems and a plurality of target systems; and
  based on determining whether the first failed interaction is associated with a data failure or a system failure:
    if the first failed interaction is associated with a data failure:
      determining by one or more second machine-learning models a category of the data failure, wherein the determining comprises analyzing the accessed first data by the one or more second machine-learning models to output a probability of the category of the data failure among a plurality of categories;
      determining by the one or more second machine-learning models, based on the category, whether the data failure can be fixed by updating the first data, wherein the determining comprises comparing the category of the data failure to a plurality of fixable categories; and
      based on determining whether the data failure can be fixed:
        if the data failure can be fixed:
          updating the first data by the one or more second machine-learning models; and
          reprocessing the first failed interaction based on the updated first data;
        if the data failure cannot be fixed:
          deleting the first data from the queue; and
          transmitting a notification comprising the category of the data failure to the first target system; and
    if the first failed interaction is associated with a system failure:
      identifying by the one or more second machine-learning models a system trend associated with the first target system by analyzing historical interaction data and success rate for interaction processing associated with the first target system;
      determining, based on the system trend, whether the system failure can be fixed, wherein the determining comprises analyzing the system trend by the one or more second machine-learning models to predict a confidence level for successfully reprocessing the first failed interaction; and
      based on determining whether the system failure can be fixed:
        if the system failure can be fixed:
          determining by the one or more second machine-learning models a reprocessing schedule to reprocess the first failed interaction; and
          reprocessing the first failed interaction according to the reprocessing schedule; and
        if the system failure cannot be fixed:
          deleting the first data from the queue; and
          transmitting a notification comprising the system trend to the first target system.
Claim: 11. The method of claim 10, further comprising: encrypting the data associated with the plurality of failed interactions with the one or more target systems.
Claim: 12. The method of claim 10, wherein the first data associated with the first failed interaction with the first target system is encrypted, and wherein the method further comprises: decrypting the first data associated with the first failed interaction with the first target system.
Claim: 13. The method of claim 10, wherein the system trend comprises an availability of the first target system, and wherein the reprocessing schedule is based on the availability of the first target system.
Claim: 14. The method of claim 10, further comprising: generating, based on the one or more second machine-learning models, an availability query; transmitting, to the first target system, the availability query; and receiving, from the first target system, the availability of the first target system.
Claim: 15. The method of claim 10, further comprising: generating the one or more first machine-learning models based on contextual and behavioral signals collected over a period of time from a plurality of applications.
Claim: 16. A non-transitory computer-readable medium storing instructions that when executed by a processor cause the processor to:
  access, from a queue, a first data associated with a first failed interaction with a first target system, wherein the queue is configured to store data associated with a plurality of failed interactions with one or more target systems during operation of distributed computer systems;
  determine, based on an analysis of the accessed first data and historical failure data by one or more first machine-learning models, whether the first failed interaction is associated with a data failure or a system failure, wherein the analysis comprises a comparison between the accessed first data and the historical failure data by the one or more first machine-learning models to generate an output comprising a probability indicating the data failure or the system failure, and wherein the historical failure data is associated with historical failed interactions between a plurality of source systems and a plurality of target systems; and
  based on determining whether the first failed interaction is associated with a data failure or a system failure:
    if the first failed interaction is associated with a data failure:
      determine by one or more second machine-learning models a category of the data failure, wherein the determining comprises analyzing the accessed first data by the one or more second machine-learning models to output a probability of the category of the data failure among a plurality of categories;
      determine by the one or more second machine-learning models, based on the category, whether the data failure can be fixed by updating the first data, wherein the determining comprises comparing the category of the data failure to a plurality of fixable categories; and
      based on determining whether the data failure can be fixed:
        if the data failure can be fixed:
          update the first data by the one or more second machine-learning models; and
          reprocess the first failed interaction based on the updated first data; and
        if the data failure cannot be fixed:
          delete the first data from the queue; and
          transmit a notification comprising the category of the data failure to the first target system; and
    if the first failed interaction is associated with a system failure:
      identify by the one or more second machine-learning models a system trend associated with the first target system by analyzing historical interaction data and success rate for interaction processing associated with the first target system;
      determine, based on the system trend, whether the system failure can be fixed, wherein the determining comprises analyzing the system trend by the one or more second machine-learning models to predict a confidence level for successfully reprocessing the first failed interaction; and
      based on determining whether the system failure can be fixed:
        if the system failure can be fixed:
          determine by the one or more second machine-learning models a reprocessing schedule to reprocess the first failed interaction; and
          reprocess the first failed interaction according to the reprocessing schedule; and
        if the system failure cannot be fixed:
          delete the first data from the queue; and
          transmit a notification comprising the system trend to the first target system.
Claim: 17. The non-transitory computer-readable medium of claim 16, wherein the instructions further cause the processor to encrypt the data associated with the plurality of failed interactions with the one or more target systems.
Claim: 18. The non-transitory computer-readable medium of claim 16, wherein the first data associated with the first failed interaction with the first target system is encrypted, wherein the instructions further cause the processor to decrypt the first data associated with the first failed interaction with the first target system.
Claim: 19. The non-transitory computer-readable medium of claim 16, wherein the system trend comprises an availability of the first target system, and wherein the reprocessing schedule is based on the availability of the first target system.
Claim: 20. The non-transitory computer-readable medium of claim 19, wherein the instructions further cause the processor to: generate, based on the one or more second machine-learning models, an availability query; transmit, to the first target system, the availability query; and receive, from the first target system, the availability of the first target system.
Current International Class: 06; 06
Accession Number: edspap.20250103446
Database: USPTO Patent Applications