AI Database Agents: Semi-Autonomous AI in SkySQL Cloud

In-Database Semi-Autonomous AI Agents

December 11, 2024

Conversational AI isn’t just a tech upgrade—it’s a huge shift in how we interact with applications. It will make apps smarter, more personalized, and yes, a lot more engaging.

Imagine this: A manufacturing assembly line operator receives a fault alert. Instead of trawling through dashboards or calling the tech team, they ask the AI something like, “Which component triggered the shutdown?” Guided by their own experience and probing questions, they trace cascading failures to pinpoint the root cause in record time. Efficiency? Through the roof.

Or picture this: You’re shopping online and casually drop in a request like, “Find me a refrigerator that fits this 68x30x28 space, has a big ice chest, offers old-unit recycling, and costs less than $2,000.” Moments later, the AI serves up perfectly curated options. No filters to tweak, no categories to browse. Just exactly what you need.

And here’s one for the support agents out there: Instead of rephrasing or interpreting a customer’s problem (which is always fun, right?), they pass the question directly to the AI. The AI gets straight to work, generating precise, context-aware responses—no confusion, no endless ticket-passing.

From factory floors to shopping carts to help desks, conversational AI promises not just greater efficiency but a fundamentally better user experience.

We believe Conversational AI will benefit every single operational app, and SaaS platforms stand to gain the most. Most of these apps rely on an OLTP SQL database. Unlike data warehouses or lakes that are optimized for static, historical insights, operational databases are dynamic, capturing live transactions as they happen.

This poses a unique challenge: conversational responses must reflect the current state of the data accurately and consistently. When an AI assistant retrieves insights, any mismatch between what’s real-time and what’s served risks undermining user trust. It’s not just about speed; it’s about precision at scale.

How Do We Enable Conversations on Operational Databases?

The short answer: Use AI Agents.

An Agent is an AI component that translates user queries, reasons through them, and generates responses grounded in context. Our SkyAI semantic agents use a RAG (Retrieval Augmented Generation) pipeline to retrieve metadata (e.g., table schemas, relationships) and generate SQL queries. The queries are executed on your database, and the results are synthesized into user-friendly responses.

What is a RAG Pipeline?

A RAG (Retrieval Augmented Generation) pipeline enhances the accuracy and relevance of responses generated by large language models (LLMs). It allows LLMs to incorporate real-time information from external data sources like databases or knowledge bases, effectively grounding their outputs in up-to-date, contextually relevant information.
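To make the flow concrete, here is a minimal sketch of the four steps described above: retrieve schema metadata, generate SQL, execute it, and synthesize an answer. Everything in it is an assumption for illustration: the `SCHEMA_DOCS` store, the keyword-based retrieval, and the stubbed `generate_sql` (a real agent would prompt an LLM here) are hypothetical, and SQLite stands in for the operational database. It is not how SkyAI is implemented.

```python
import sqlite3

# Hypothetical metadata store: table name -> short schema description.
SCHEMA_DOCS = {
    "orders": "orders(id, customer, total, status): one row per customer order",
    "machines": "machines(id, line, state): assembly line equipment and its state",
}


def retrieve_metadata(question, docs=SCHEMA_DOCS):
    """Retrieval step: keep schema docs whose table name appears in the question.

    A real pipeline would use a semantic (vector) search instead of keywords.
    """
    words = set(question.lower().replace("?", "").split())
    return [doc for name, doc in docs.items() if name in words]


def generate_sql(question, context):
    """Generation step, stubbed out.

    A real agent would send the question plus the retrieved context to an LLM;
    this stand-in returns a canned query when the orders schema was retrieved.
    """
    if any(doc.startswith("orders") for doc in context):
        return "SELECT COUNT(*) FROM orders WHERE status = 'open'"
    raise ValueError("no relevant schema found for this question")


def answer(question, conn):
    """Run the full loop: retrieve, generate, execute, synthesize."""
    context = retrieve_metadata(question)
    sql = generate_sql(question, context)
    (count,) = conn.execute(sql).fetchone()
    return f"There are {count} open orders."


# In-memory SQLite database standing in for the live operational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "acme", 120.0, "open"), (2, "globex", 75.5, "shipped"), (3, "initech", 310.0, "open")],
)

print(answer("How many orders are open?", conn))  # There are 2 open orders.
```

Because the SQL runs against the database at question time, inserting another open order and asking again would return an updated count, which is the "current state" property operational apps need.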

The Challenges

Building an Agentic RAG pipeline isn’t trivial:

Complexity: Requires stitching together frameworks like LangChain or LlamaIndex, or cloud services like AWS Bedrock.
Consistency & Security: Often involves sending proprietary data to external vector stores or managing it within chat sessions. Ensuring data consistency and compliance with security policies can be challenging.
Hallucinations: Even with a RAG pipeline providing the correct context, all LLMs (tuned or not) can provide imagined answers. It is challenging to evaluate responses for relevance and accuracy - especially when LLMs are primarily used to generate SQL.

While RAG pipelines offer great potential, implementing them for operational databases requires a lot of work to ensure accuracy, security, and scalability. The diagram below shows a typical Agentic RAG architecture built using AI frameworks and highlights some of these challenges.

What If Your DB Service Could Create and Manage Agents?

SkySQL brings the capability to create and manage Agents directly within the database service, combining semantic understanding, retrieval, and conversational capabilities into a unified DBaaS offering. Your AI application can now create higher-level Agents that orchestrate these lower-level DB agents as tools.

Current databases like Postgres, MariaDB, and MongoDB support semantic searches with vector indexing, allowing text or semi-structured data to be queried effectively. However, none provide built-in support for managing Agents, orchestrating RAG pipelines, or engaging LLMs to deliver conversational capabilities.

SkySQL addresses this gap by offering natural language (NL) conversational APIs as a built-in feature. We see this as a logical evolution for databases, similar to how they adopted JSON, time series, and text search capabilities in the past.

Evolution of the relational database

Semi-Autonomous, No-Code Semantic DB Agents

SkySQL includes a No-Code Agent Builder. This tool empowers domain experts to define the missing semantics critical for accurate responses without requiring programming expertise. The system then leverages the database’s metadata—such as table definitions, constraints, and relationships—and learns from historical queries to train the Agent.

However, automation alone isn’t enough. Real-world databases often contain hundreds of tables with cryptic naming conventions, impure data, and hidden rules. This is where the human-in-the-loop design becomes essential. SkySQL engages the user interactively through a wizard-like interface that:

Proposes relevant tables and dimensions based on the Agent’s intent.
Analyzes data to compute initial semantic descriptions for columns and tables.
Allows the user to iteratively refine these semantics.

Users validate and train the Agent by asking questions, inspecting the generated SQL, and tagging “golden SQL” queries that serve as the ground truth. This iterative process ensures the Agent’s outputs are both accurate and contextually relevant.
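One way to picture how tagged golden SQL might be reused: store each validated question with its SQL, then look up the closest stored question when a new one arrives and hand that pair to the generator as an example. The sketch below is an assumption for illustration, not SkySQL's implementation. `GoldenSQLStore`, `tag`, and `nearest` are hypothetical names, and the bag-of-words `embed` function is a toy stand-in for a real embedding model and vector index.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A stand-in for a real embedding model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class GoldenSQLStore:
    """Hypothetical store of validated (question, SQL) pairs."""

    def __init__(self):
        self._entries = []  # list of (question, sql, vector)

    def tag(self, question, sql):
        """Record a question/SQL pair that a domain expert marked as correct."""
        self._entries.append((question, sql, embed(question)))

    def nearest(self, question, k=1):
        """Return the k stored pairs whose questions are most similar."""
        query = embed(question)
        ranked = sorted(
            self._entries, key=lambda entry: cosine(query, entry[2]), reverse=True
        )
        return [(q, sql) for q, sql, _ in ranked[:k]]


store = GoldenSQLStore()
store.tag("total revenue across all orders", "SELECT SUM(total) FROM orders")
store.tag(
    "open orders per customer",
    "SELECT customer, COUNT(*) FROM orders WHERE status = 'open' GROUP BY customer",
)

question, sql = store.nearest("how many open orders does each customer have")[0]
print(question)  # open orders per customer
print(sql)
```

The retrieved pair would typically be added to the prompt as a worked example, so the generated SQL follows a pattern a human has already approved.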

Under the hood, SkySQL handles:

Vector Indexing of metadata, high-cardinality text columns, and golden SQL to enable efficient semantic searches.
Automatic Orchestration of the RAG pipeline, reducing the need for external integrations and securing all AI interactions.
Online Evaluation of the results for accuracy: when the question is complex, or the guidance or semantics are incomplete, responses can be inaccurate, so it is important for users or a consuming application to know the quality of each response. We use an “LLM as Judge” approach to provide a confidence and correctness score that is biased against false positives. The evaluator is designed to assign lower confidence to uncertain responses rather than risk assigning high confidence to incorrect ones, so that a high score can be trusted.
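The bias against false positives can be illustrated with a small aggregation rule. This is only a sketch under stated assumptions, not the SkySQL evaluator: it assumes several judge verdicts per response, and the `Verdict` type, the `conservative_score` function, and the 0.8 threshold are hypothetical. The idea is to take the most pessimistic signal, so one hesitant or dissenting judge is enough to pull the score down.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Verdict:
    """One judge's opinion about a generated response (hypothetical shape)."""

    correct: bool      # did the judge consider the response correct?
    confidence: float  # judge's confidence in its own verdict, in [0, 1]


def conservative_score(
    verdicts: List[Verdict], threshold: float = 0.8
) -> Dict[str, Union[bool, float]]:
    """Aggregate verdicts pessimistically.

    Each verdict is converted to "support for the response being correct":
    the confidence itself for a 'correct' vote, and 1 - confidence for an
    'incorrect' vote. The overall score is the minimum support, so a single
    doubtful judge lowers it. No verdicts at all means no trust.
    """
    if not verdicts:
        return {"trusted": False, "confidence": 0.0}
    support = [v.confidence if v.correct else 1.0 - v.confidence for v in verdicts]
    confidence = min(support)
    return {"trusted": confidence >= threshold, "confidence": confidence}


agreeing = [Verdict(True, 0.9), Verdict(True, 0.85)]
hesitant = [Verdict(True, 0.95), Verdict(True, 0.6)]

print(conservative_score(agreeing))  # {'trusted': True, 'confidence': 0.85}
print(conservative_score(hesitant))  # {'trusted': False, 'confidence': 0.6}
```

Taking the minimum rather than the average trades some false negatives (good answers flagged as uncertain) for fewer false positives, which matches the stated design goal.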

Once trained, the Agent can be consumed via a simple REST API that supports:

Stateful Chat Sessions.
On-demand Natural Language Queries.
Advanced Semantic Searches (coming in the near future).

SkyAI Agent Architecture


Built-in Semantic Agents for Development and DB Administration (Developer and DBA Copilot Agents)

If we can offer Semantic Agents for application use, it stands to reason we can dogfood the same technology for SQL developers and DBAs, bringing the power of conversational AI directly to the database management process.

Today, we offer several built-in agents designed to streamline development and database administration tasks.

1) Developer Copilot Agent for SQL Developers

This agent functions much like modern copilot tools but is specifically tailored for SkySQL and MariaDB. It allows developers to interact with the database using natural language queries, enabling them to quickly find solutions without needing to dive deep into documentation.

You can ask a wide range of questions, such as:

General MariaDB Queries:
- "How can I tune the InnoDB storage engine?"
SkySQL-Specific Queries:
- "Show me a SkySQL program to connect from Java."
- "In SkySQL, how can I configure my DB properties?"

Additionally, the agent can generate complex SQL queries spanning multiple tables, create schemas, write integration code, and even assist with tasks like generating stored procedures or loading data. This agent is trained using the SkySQL documentation and leverages the OpenAI LLM's prior knowledge to provide accurate, context-aware responses.

Example Question

2) DBA Copilot Agent

The DBA Copilot is a specialized agent that helps DBAs with system information, tuning, and diagnostics. It taps directly into SkySQL's built-in system tables and metadata to answer queries about the database's internal state.

When a user asks a question, it breaks the query down into discrete steps, each of which typically gets translated into a SQL statement targeting system tables such as those in information_schema, mysql, or performance_schema. These steps are executed to fetch relevant data and provide actionable insights, making it easier for DBAs to monitor and optimize database performance.
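The step-by-step behaviour described above can be sketched as a small plan-and-execute loop. The plan below is hand-written and hypothetical (a real agent would have an LLM produce it), and the `Step`, `plan_for_slow_database`, and `run_plan` names are assumptions for illustration. The SQL targets `information_schema` tables that exist in MariaDB; the executor is passed in as a function so the sketch runs without a database.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Rows = Sequence[tuple]


@dataclass
class Step:
    """One discrete step of a DBA question: what to check and the SQL to run."""

    description: str
    sql: str


def plan_for_slow_database() -> List[Step]:
    """Hypothetical plan for the question 'Why is my database slow?'."""
    return [
        Step(
            "Find the longest-running statements",
            "SELECT id, user, time, info FROM information_schema.processlist "
            "WHERE command <> 'Sleep' ORDER BY time DESC LIMIT 5",
        ),
        Step(
            "Find the largest tables",
            "SELECT table_schema, table_name, data_length + index_length AS bytes "
            "FROM information_schema.tables ORDER BY bytes DESC LIMIT 5",
        ),
    ]


def run_plan(steps: List[Step], execute: Callable[[str], Rows]) -> List[Tuple[str, Rows]]:
    """Execute each step in order and pair its description with the rows returned."""
    return [(step.description, execute(step.sql)) for step in steps]


def fake_execute(sql: str) -> Rows:
    """Stand-in executor for the demo; it does not touch a database."""
    return [("demo row for: " + sql[:30],)]


for description, rows in run_plan(plan_for_slow_database(), fake_execute):
    print(description, "->", len(rows), "row(s)")
```

In practice `execute` would wrap a cursor from your MariaDB driver, and the collected rows would be passed back to the LLM to be summarized into advice for the DBA.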

SkySQL’s integration of semi-autonomous AI agents represents a transformative step in application development and database management. By seamlessly combining natural language interfaces, advanced semantic understanding, and robust operational capabilities, SkySQL enables organizations to turn complex data interactions into intuitive, real-time conversations. Whether enabling smarter applications, optimizing performance or enhancing developer productivity, SkySQL sets a new standard for innovation in the operational database landscape.

Ready to experience the future of databases? Try SkySQL using the free serverless tier today or contact us for a demo. Stay tuned for further blogs on AI agents and how to embed them into your apps.