Workload analysis tool for relational databases
Gespeichert in:
| Titel: | Workload analysis tool for relational databases |
|---|---|
| Patent Number: | 7,281,013 |
| Publikationsdatum: | October 09, 2007 |
| Appl. No: | 10/161397 |
| Application Filed: | June 03, 2002 |
| Abstract: | A method for providing workload information in a structured workload information data structure format that is organized according to a workload schema to be conducive to a given end usage of the information. The structured workload information can be made accessible using standard database analytical server applications to facilitate ad-hoc querying of the structured workload information to summarize and analyze the database workload or to facilitate exchange of workload information. A structured workload information (SWI) is constructed according to a SWI schema to facilitate a desired end usage of the workload information. The query information is extracted from the workload and stored in a structured workload information (SWI) data structure according to the schema based on the desired end usage of the information such as ad hoc querying or information exchange. The query information may be stored in a relational database having query information organized as a central fact table and a collection of hierarchical dimension tables or as an OLAP cube featuring hierarchical dimensions that arrange the query information in dimensions having objects ordered as a function of granularity or the information may be stored according to an XML schema wherein units of query information are separated by XML tags that identify a type of workload information. |
| Inventors: | Chaudhuri, Surajit (Redmond, WA, US); Narasayya, Vivek (Redmond, WA, US); Zaki, Omer (Sunnyvale, CA, US) |
| Assignees: | Microsoft Corporation (Redmond, WA, US) |
| Claim: | 1. In a relational database system upon which a query is executed, the relational database system having a workload comprising a series of logged queries and associated query information, a computer-implemented method for populating a structured workload information data structure to facilitate a given usage of workload information by an end user, the method comprising the steps of: extracting the query information from the workload; selecting a predetermined schema from among a plurality of predetermined schema wherein the predetermined schema is one of a relational database schema, an OLAP schema, and an XML schema; and using the selected predetermined schema to transform the query information into the structured workload information data structure as defined by the selected predetermined schema, and storing the structured workload information data structure. |
| Claim: | 2. The method of claim 1 wherein the predetermined schema is selected based on the given usage of the workload information. |
| Claim: | 3. The method of claim 1 wherein the predetermined schema is hierarchical in nature, and objects in the schema are arranged in dimensions wherein objects in a dimension are ordered based on a degree of granularity. |
| Claim: | 4. The method of claim 1 further comprising the step of storing the query information in a relational database. |
| Claim: | 5. The method of claim 4 wherein the query information is organized as a central fact table and a collection of hierarchical dimension tables. |
| Claim: | 6. The method of claim 1 wherein the predetermined schema is an OLAP schema for arranging the query information in an OLAP cube. |
| Claim: | 7. The method of claim 6 wherein the OLAP cube comprises hierarchical dimensions with the query information being arranged in the hierarchical dimensions, the arranged query information having objects ordered as a function of granularity. |
| Claim: | 8. The method of claim 6 wherein the OLAP cube has a dimension for at least one of the following types of query information in the database workload: data objects accessed by the query, a time the query occurred, a user submitting the query, a machine on which the query was submitted, a type of query, physical operators included in the query or an associated query plan, and a nature of predicates in the query. |
| Claim: | 9. The method of claim 1 wherein the step of extracting information from the workload is performed by accessing a query plan for each query of the series of logged queries in the workload. |
| Claim: | 10. The method of claim 1 wherein the query information comprises a plurality of units of information and the step of storing the query information includes appending identifying tags to each of the plurality of units of information. |
| Claim: | 11. The method of claim 10 wherein the identifying tags are XML tags, and wherein the XML tags identify at least one of the following types of workload information: a SQL string for the query, a query category, a list of tables and columns referenced or updated by the query, a total optimizer estimated cost, an estimated cardinailty of the query, and a sequence of logical and physical operators including arguments used irra query plan. |
| Claim: | 12. An apparatus for use on a relational database system upon which a query is executed, the relational database system having a workload comprising a series of logged queries and associated query information, the apparatus for populating a structured workload information data structure to facilitate a given usage of workload information by an end user, the apparatus comprising: an information extractor for extracting query information from the workload; and a structured workload information populator for storing the query information in the structured workload information data structure, where the query information is transformed into the structured workload data structure using a schema wherein the svhema is one of a relational database schema, an OLAP schema, and an XML schema, and where the structured workload information populator is configured to be able perform the transforming. |
| Claim: | 13. The apparatus of claim 12 further comprising an information loader for storing the query information in a relational database. |
| Claim: | 14. The apparatus of claim 12 wherein the structured workload information (SWI) populator stores the query information according to a predetermined schema, the predetermined schema being hierarchical in nature. |
| Claim: | 15. The apparatus of claim 12 wherein the structured workload information populator stores the query information as one or more units of information having identifying tags appended thereto, and wherein the identifying tags indicate a type of query information contained in each of the one or more units of information. |
| Claim: | 16. A computer readable medium embodied data structure for facilitating analysis of workload information regarding a workload for a database, wherein: the workload comprises a set of logged queries and associated query information, the logged queries having been executed against the database; the data structure includes a first data field representing a first dimension, the first data field containing a first data element representing a first type of the query information, the first data element arranged in a hierarchy based on a level of granularity of the query information represented in the first dimension; the data structure includes a second data field representing a second dimension, the second data field containing a second data element representing a second type of the query information, the second data element arranged in a hierarchy based on the level of granularity of the query information represented in the second dimension: and, the data structure imposes a physical structure on the first data element and the second data element to facilitate analysis of workload information. |
| Claim: | 17. The computer readable medium of claim 16 wherein the first type of query information is a time a query occurred and the hierarchy of the first dimension ranges from year to millisecond. |
| Claim: | 18. A computer readable medium having instructions stored thereon for performing computer-implemented method steps for populating a structured workload information data structure to facilitate a given end usage of workload information regarding a workload for a relational database system upon which a query is executed, wherein the workload comprises a series of logged queries and associated query information, the method steps comprising: extracting the query information from the workload; selecting a predetermined schema from among a plurality of predetermined schema wherein the predetermined schema is one of a relational database schema, an OLAP schema, and an XML schema; and using the selected predetermined schema to transform the query information into the structured workload information data structure as defined by the selected predetermined scheme, and storing the structured workload information data structure. |
| Claim: | 19. The computer readable medium of claim 18 wherein the predetermined schema is selected based on the given end usage of the workload information. |
| Claim: | 20. The computer readable medium of claim 18 wherein the predetermined schema is hierarchical in nature, and objects in the schema are arranged in dimensions wherein objects in a dimension are ordered based on a degree of granularity. |
| Claim: | 21. The computer readable medium of claim 18 wherein the method further comprises the step of storing the query information in a relational database. |
| Claim: | 22. The computer readable medium of claim 21 wherein the query information is organized as a central fact table and a collection of hierarchical dimension tables. |
| Claim: | 23. The computer readable medium of claim 18 wherein the predetermined schema is an OLAP schema for arranging query information in an OLAP cube. |
| Claim: | 24. The computer readable medium of claim 23 wherein the OLAP cube comprises hierarchical dimensions, and wherein the query information is arranged in the hierarchical dimensions, the arranged query information having objects ordered as a function of granularity. |
| Claim: | 25. The computer readable medium of claim 23 wherein the OLAP cube has a dimension for at least one of the following types of query information for each of the series of logged queries in the workload: a data object accessed by the query, a time the query occurred, a user submitting the query, a machine on which the query was submitted, a type of query, a list of physical operators included in the query or an associated query plan, and a nature of predicates in the query. |
| Claim: | 26. The computer readable medium of claim 18 wherein the step of extracting information from the workload is performed by accessing a query plan for each query of the series of logged queries in the workload. |
| Claim: | 27. The computer readable medium of claim 18 wherein the query information comprises a plurality of units of information and the step of storing the query information includes appending identifying tags to each of the plurality of units of information. |
| Claim: | 28. The computer readable medium of claim 27 wherein the identifying tags are XML tags, and wherein the XML tags identify at least one of the following types of workload information: a SQL string for the query, a query category, a list of tables and columns referenced or updated by the query, a total optimizer estimated cost, an estimated cardinality of the query, and a sequence of logical and physical operators including arguments used in a query plan. |
| Claim: | 29. A schema embodied on a computer readable medium for structuring workload information regarding a workload for a database, into structured workload information data structures that are structured to allow multi-dimensional analysis of the workload, where a structured workload information data structure comprises a plurality of dimensions organizing query information in a hierarchical fashion according to a level of granularity of the query information; wherein the schema imposes a physical structure on the workload information to facilitate a multi-dimensional analysis of the workload. |
| Claim: | 30. The schema of claim 29 , further comprising a dimension for at least one of the following types of query information in the workload: a data object accessed by a query, a time the query occurred, a user submitting the query, a machine on which the query was submitted, a type of query, a list of physical operators included in the query or an associated query plan, and a nature of predicates in the query. |
| Claim: | 31. A schema embodied on a computer readable medium structuring workload information regarding a workload for a database, into structured workload information data structures so as to facilltate exchange of the workload information between computers, the schema comprising: identifying tags that are appended to workload information, wherein the identifying tags are XML tags, and wherein the XML tags identify at least one of the following types of workload information: a SQL string for the query, a query category, a list of tables and columns referenced or updated by the query, a total optimizer estimated cost, an estimated cardinality of the query, and a sequence of logical and physical operators including arguments used in a query plan; wherein the schema imposes a physical organization on the workload information to facilitate exchange of the workload information between computers. |
| Current U.S. Class: | 707/101 |
| Patent References Cited: | 5960423 September 1999 Chaudhuri et al. 6598038 July 2003 Guay et al. 6601062 July 2003 Deshpande et al. 2001/0037345 November 2001 Kiernan et al. 2003/0065648 April 2003 Driesch et al. |
| Other References: | Gunderloy, M., “Precise/Indepth for SQL Server”, Aug. 2002, Redmondmag.com. cited by other Woodie, A., “Centerfield Addresses DASD Spikes at Protective Shell”, May 28, 2002, vol. 2, No. 20, OS/400 Edition, Midrange Stuff. cited by other Arbi, Z., “BMC's ‘PATROL’good for your applications”, Jun. 5, 2000, Jakarta Post 9. cited by other “CA's New Unicenter Database Management Solutions Greatly Simplify Management of Enterprise-Wide Database Environments”, 2002, PR Newswire, Computer Associates International,. cited by other “The Paradyn Parallel Performance Measurement Tool”, Barton P. Miller, Mark D. Callaghan, Jonathan M. Cargille, et al., University of Wisconsin, Madison, Nov. 1995, pp. 37-46, from IEEE. cited by other Data Management, Exploration & Mining, “AutoAdmin: Self-Tuning and Self-Administering Databases”, Sanjay Agrawal, et al. cited by other “DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes”, Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy Lohman and Alan Skelley, IEEE, pp. 101-110, copyright 2000. cited by other |
| Assistant Examiner: | Nguyen, CamLinh |
| Primary Examiner: | Gaffin, Jeffrey |
| Dokumentencode: | edspgr.07281013 |
| Datenbank: | USPTO Patent Grants |
| Abstract: | A method for providing workload information in a structured workload information data structure format that is organized according to a workload schema to be conducive to a given end usage of the information. The structured workload information can be made accessible using standard database analytical server applications to facilitate ad-hoc querying of the structured workload information to summarize and analyze the database workload or to facilitate exchange of workload information. A structured workload information (SWI) is constructed according to a SWI schema to facilitate a desired end usage of the workload information. The query information is extracted from the workload and stored in a structured workload information (SWI) data structure according to the schema based on the desired end usage of the information such as ad hoc querying or information exchange. The query information may be stored in a relational database having query information organized as a central fact table and a collection of hierarchical dimension tables or as an OLAP cube featuring hierarchical dimensions that arrange the query information in dimensions having objects ordered as a function of granularity or the information may be stored according to an XML schema wherein units of query information are separated by XML tags that identify a type of workload information. |
|---|