Systems and methods for managing structured query language on dynamic schema databases

Bibliographic details
Title: Systems and methods for managing structured query language on dynamic schema databases
Patent Number: 12,265,539
Publication date: April 01, 2025
Appl. No: 17/856379
Application Filed: July 01, 2022
Abstract: In various aspects of the present disclosure, systems and methods are described to identify and resolve structured queries so they execute consistently and accurately against any data architecture, and for example, dynamic or unstructured database stores. According to one embodiment, a dynamic schema data system implements a query dialect that is configured to expose underlying flexible schemas of the dynamic schema data system, any structured data, unstructured or partially structured data, and expressive querying native to the dynamic schema system in a language that is compatible with structured queries, and for example, compatible with SQL-92. In further embodiments, the query dialect is configured to enable consistency with existing dynamic schema database query semantics (e.g., the known MongoDB database and associated query semantics).
Inventors: MongoDB, Inc. (New York, NY, US)
Assignees: MongoDB, Inc. (New York, NY, US)
Claim: 1. A distributed database system comprising: at least one processor operatively connected to a memory; a distributed database including data stored under a dynamic schema architecture or an unstructured architecture; a query engine, executed by the at least one processor, configured to: accept a user defined query; execute the user defined query against the distributed database; identify structured query language elements and native query language elements comprising at least in part unstructured query operators in the user defined query; define a first native query stage of execution for the native query language elements and a first structured query language stage for execution of the structured query language elements; map structured query language semantics for execution on unstructured data in the distributed database; execute both of the first structured query stage and the first native query stage on source database data comprising the unstructured data wherein at least some functions of the unstructured query operators are performed on the unstructured data directly in the first native query stage and at least some functions of the structured query language elements are performed on the unstructured data directly in the first structured query stage; and output results of the executed stages, including results from the structured query language semantics for communication to the user or for further processing by another query stage.
Claim: 2. The system of claim 1, wherein the structured query language semantics are associated with an operation to be performed and the query engine is further configured to map the structured query language semantics to a data environment, the data environment configured to manage execution of the structured query language semantics.
Claim: 3. The system of claim 2, wherein the query engine is further configured to define a catalog environment and values environment during execution of the first structured query language stage and determine binding values associated with respective data environments on which the operation is to be performed.
Claim: 4. The system of claim 3, wherein the query engine is further configured to execute the operation on the binding values; and translate the binding values into a native processing format as an output result or input to a further processing stage.
Claim: 5. The system of claim 2, wherein the query engine is further configured to stream output binding values to subsequent operations in the user defined query.
Claim: 6. The system of claim 1, wherein the query engine is further configured to preserve semantics of the native query language on which the structured queries are mapped.
Claim: 7. The system of claim 1, wherein the query engine is further configured to execute compile time evaluation of the user defined query.
Claim: 8. The system of claim 7, wherein the query engine is further configured to execute the compile time evaluation regardless of availability of schema information for source data.
Claim: 9. The system of claim 7, wherein the query engine is further configured to infer probable schema information based on the user defined query.
Claim: 10. The system of claim 7, wherein the query engine is further configured to perform static type-checking and result set metadata computation without requiring source schema information prior to execution of the user defined query.
Claim: 11. A computer implemented method for managing a distributed database system, the method comprising: accepting, by at least one processor, a user defined query; executing, by the at least one processor, the user defined query against the distributed database including data stored under a dynamic schema architecture or an unstructured architecture; identifying, by the at least one processor, structured query language elements and native query language elements comprising at least in part unstructured query operators in the user defined query; defining, by the at least one processor, a first native query stage of execution for the native query language elements and a first structured query language stage for execution of the structured query language elements; mapping, by the at least one processor, structured query language semantics for execution on unstructured data in the distributed database; executing, by the at least one processor, both of the first structured query stage and the first native query stage on source database data comprising the unstructured data wherein at least some functions of the unstructured query operators are performed on the unstructured data directly in the first native query stage and at least some functions of the structured query language elements are performed on the unstructured data directly in the first structured query stage; and outputting, by the at least one processor, results of the executed stages including results from the structured query language semantics for communication to the user or for further processing by another query stage.
Claim: 12. The method of claim 11, wherein the structured query language semantics are associated with an operation to be performed and the method further comprises: defining, by the at least one processor, a data environment for managing execution of the structured query language semantics; and mapping, by the at least one processor, the structured query language semantics to the data environment on which to perform the operation.
Claim: 13. The method of claim 12, wherein the method further comprises defining, by the at least one processor, a catalog environment and values environment during execution of the first structured query language stage and determining, by the at least one processor, binding values associated with respective data environments on which the operation is to be performed.
Claim: 14. The method of claim 13, wherein the method further comprises: executing, by the at least one processor, the operation on the binding values; and translating, by the at least one processor, the binding values into a native processing format as an output result or input to a further processing stage.
Claim: 15. The method of claim 12, wherein the method further comprises streaming, by the at least one processor, output binding values to subsequent operations in the user defined query.
Claim: 16. The method of claim 11, wherein the method further comprises preserving, by the at least one processor, semantics of the native query language on which the structured queries are mapped.
Claim: 17. The method of claim 11, wherein the method further comprises executing, by the at least one processor, compile time evaluation of the user defined query.
Claim: 18. The method of claim 17, wherein the method further comprises executing, by the at least one processor, the compile time evaluation regardless of availability of schema information for source data.
Claim: 19. The method of claim 17, wherein the method further comprises inferring, by the at least one processor, probable schema information based on the user defined query.
Claim: 20. The method of claim 17, wherein the method further comprises performing, by the at least one processor, static type-checking and result set metadata computation without requiring source schema information, prior to execution of the user defined query.
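Illustrative note: the staged execution recited in claims 1 and 11 (a native query stage and a structured query language stage, both operating directly on schemaless data, with results streamed from one stage to the next) can be pictured with a minimal Python sketch. This is not the patented implementation or any MongoDB API. The names `native_unwind`, `sql_select`, and `execute`, the sample `orders` data, and the choice to represent absent fields as `None` are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, Any]
Stage = Callable[[Iterable[Doc]], Iterator[Doc]]


def native_unwind(field: str) -> Stage:
    """Hypothetical native-style stage: one output document per array element."""
    def run(docs: Iterable[Doc]) -> Iterator[Doc]:
        for doc in docs:
            value = doc.get(field)
            if isinstance(value, list):
                for item in value:
                    yield {**doc, field: item}
            elif value is not None:
                # Scalar values pass through unchanged; documents
                # without the field are dropped.
                yield doc
    return run


def sql_select(columns: List[str], where: Callable[[Doc], bool]) -> Stage:
    """Hypothetical SQL-style stage: WHERE filter, then column projection."""
    def run(docs: Iterable[Doc]) -> Iterator[Doc]:
        for doc in docs:
            if where(doc):
                # Fields absent from a document are projected as None
                # (an assumption of this sketch).
                yield {col: doc.get(col) for col in columns}
    return run


def execute(docs: Iterable[Doc], stages: List[Stage]) -> List[Doc]:
    """Chain the stages lazily so each one streams its output to the next."""
    stream: Iterable[Doc] = docs
    for stage in stages:
        stream = stage(stream)
    return list(stream)


# Documents with differing shapes, as in a dynamic schema store.
orders = [
    {"customer": "ada", "items": ["pen", "ink"]},
    {"customer": "bob", "items": "paper", "vip": True},
    {"customer": "cy"},
]

rows = execute(orders, [
    native_unwind("items"),
    sql_select(["customer", "items", "vip"], lambda d: d["items"] != "ink"),
])
```

Both stages read the documents as they are, with no table definition declared up front, and the SQL-style stage still returns a fixed set of columns for every row.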
Patent References Cited: 11567735 January 2023 Kulkarni
2013/0117288 May 2013 De Smet
2015/0039641 February 2015 Neeman
2015/0120699 April 2015 Faerber
2017/0262516 September 2017 Horowitz
2019/0236182 August 2019 Tiyyagura
Assistant Examiner: Rajaputra, Suman
Primary Examiner: Gofman, Alex
Attorney, Agent or Firm: Wolf, Greenfield & Sacks, P.C.
Document code: edspgr.12265539
Database: USPTO Patent Grants