Specifically, MySQL Autopilot uses internalized machine learning to assist users for some tasks, and applies closed-loop automation to handle others. It assists users by making recommendations on how to provision and load the database, and implements closed-loop automation for failure handling/error recovery and query execution. And with machine learning, Autopilot can iteratively improve query plans based on results from other similar, different queries. The latest release of MySQL HeatWave also increases cluster sizing to 64, more than doubling capacity limits to 32-TByte instances, and raises scalability by 20%. For instance, when provisioning the database, Autopilot conducts a smart scan of the table using adaptive sampling techniques to predict the most optimal cluster sizing. This is based on predicting the amount of memory that will be required for holding the table and the intermediate results in memory. In this case, the output of the model is making recommendations, including an explanation for the recommendation to the customer, who ultimately decides on cluster sizing. Conversely, in the case of query execution, the model automatically schedules queries using logic to reduce contention between short- and long-running queries characteristic of mixed workloads. Without such smart query scheduling, short queries, which are expected to be interactive, could get delayed by long workloads unless a database administrator takes action. With Autopilot, this scenario can be avoided as it predicts the execution time of each query, then prioritizes short queries over long ones to reduce or eliminate waits for interactive queries. Other Autopilot functions include optimizations for loading and deciding where to physically place data. Based on query patterns, the algorithm strives to place rows that are most frequently joined closer together and within the same, or adjoining partitions on the same node. It then predicts the degree of improvement that the user can expect by comparing the proposed optimized placement vs. the default. Explaining the results will be useful for customers deciding whether to take HeatWave’s data placement recommendations. A related feature impacting query planning is auto query plan improvement where the system learns from queries and improves performance of subsequent queries which aren’t necessarily identical with existing queries. By comparison, query planners with conventional relational databases tend to be rules-based, optimizing for cost or performance. In turn, experienced DBAs typically take on the task of schema design to optimize for anticipated queries, and laying out the data (data loading to optimize access). MySQL Autopilot routines automate these tasks using ML, and the company claims that its algorithms can improve performance on benchmarks such as TPC-H (typically used for analytics) by up to 40%. As we have noted with Oracle Autonomous Database, databases mark an ideal use case for applying ML as the problems are well-bounded, the optimizations can be well-defined (e.g., optimize for performance, throughput, or capacity, etc.) and can feed off a rich supply of operational data. Autopilot employs ML models for optimizing query planning and performance; cluster capacity; network overhead; and parallelizing loading operations. Other enhancements in this release include boosts in capacity; while the initial release of Oracle MySQL HeatWave limited clusters to 24 nodes (holding up to 12 TBytes of data), the new release boosts that to 64 nodes and 32 TBytes. System scalability for end-to-end queries has been improved by about 20%. As we noted when Oracle first released the current MySQL Database Service with HeatWave at the end of last year, the Oracle service was not meant to be a copycat of rival cloud services that implemented the core open source MySQL database without modifications. When comparing Oracle MySQL HeatWave with other cloud data warehousing or “data lakehouse” platforms, there are still differences with regard to integration of AutoML or other machine learning services, and there isn’t any capability for running Spark natively. That said, the initial target is organizations seeking to consolidate their MySQL instances and expanding them to analytics. We could see such features coming to HeatWave down the road. HeatWave’s machine learning -driven Autopilot features naturally invite comparisons with Oracle Autonomous Database. But each serves a very different audience. Built on Exadata, the Autonomous Database is intended as an enterprise consolidation platform for Oracle instances that can hit the petabyte range. As noted above, MySQL databases are typically much smaller, tend to be run at department level, with customers who are looking for low cost. In most organizations, there might be dozens, if not hundreds or more MySQL departmental databases floating around. That’s the sweet spot that Oracle is targeting with the MySQL Database Service with HeatWave. As we noted, given that others have beaten Oracle to the starting line with their own vanilla MySQL implementations, Oracle had to do something different. In this release, adding machine learning sends the message to developers that on Oracle Cloud Infrastructure, their MySQL instances should also require less care and feeding and perform better. Disclosure: Oracle is a dbInsight client.