Building AI Agent Systems with LlamaIndex and SkySQL

May 2, 2025

What we learned building SkyAI Agents with LlamaIndex

Recently, we teamed up with the LlamaIndex crew for a webinar that showed SkySQL’s new SkyAI Agents turning plain‑language questions into reliable answers, illustrated through a retail inventory‑management use case.

On‑Demand, Accurate SQL: Make‑or‑Break for AI Agents

Agentic applications are moving beyond documents and into operational data. Once an agent is expected to decide what it needs from a live database, two things become true:

It can’t rely on hard‑coded queries. A goal‑oriented agent decides what data it needs as it reasons. If the user’s question involves a long‑standing product, the agent will pull multi‑year sales and seasonality; if the product has just launched, it ignores those empty tables and reaches for live stock levels or competitor prices instead. Because that choice is made on the spot, the agent has to compose new SQL each time; pre‑written queries can’t cover every path the reasoning might take.
It has to get the SQL right. Real schemas are messy: hundreds of tables, cryptic column names, partial foreign keys, inconsistent data. If the agent simply asks an LLM to write SQL without that context, the model tends to hallucinate columns or join on the wrong keys, and nobody will trust the answer.

Delivering on‑the‑fly, reliable SQL is the crux of bringing AI agents to enterprise data.
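To make the "hallucinated column" failure mode concrete, here is a minimal sketch of one possible guardrail: asking the database to plan a generated query before running it. It uses Python's built-in sqlite3 as a stand-in for a real MariaDB or MySQL instance; the tables, the two queries, and the validate_sql helper are hypothetical and are not part of SkySQL or LlamaIndex.

```python
import sqlite3


def validate_sql(conn: sqlite3.Connection, sql: str) -> tuple[bool, str]:
    """Ask the database to plan the query without executing it.

    Planning fails if the query references tables or columns that do not
    exist, which is how a hallucinated column shows up.
    """
    try:
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)


# Hypothetical two-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE stock (sku TEXT, store TEXT, on_hand INTEGER)")

# One query that matches the schema, one that invents a column.
good = "SELECT p.name, s.on_hand FROM products p JOIN stock s ON s.sku = p.sku"
bad = "SELECT p.name, s.quantity FROM products p JOIN stock s ON s.sku = p.sku"

ok_good, _ = validate_sql(conn, good)
ok_bad, err_bad = validate_sql(conn, bad)
```

A check like this only catches references that do not exist. A query that joins on the wrong key still plans successfully, which is why schema context and testing against known answers matter as well.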

Putting the agent where the data lives

Most current text‑to‑SQL methods work well on simple examples but struggle in production. With real‑world operational data, generating accurate queries demands context that the application tier just doesn’t have.

By placing SkyAI Agents inside SkySQL, we keep them close to the source of truth. The agent can inspect the live schema, sample real rows, and build a vector index of high‑cardinality fields like store names, SKUs, and customer segments, without shipping that metadata anywhere else. When a question arrives, it already knows which tables are relevant, how they join, and which synonyms resolve to which values. It assembles a deterministic SQL plan, runs it, and returns both the answer and a confidence score.
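The idea of resolving a user's wording to the values actually stored in a high‑cardinality column can be sketched in a few lines. This is an illustration under loud assumptions: it uses sqlite3 and plain string similarity from Python's difflib as a stand-in for a real vector index, and the stores table, build_value_index, and resolve_value are hypothetical names, not the SkyAI implementation.

```python
import difflib
import sqlite3


def build_value_index(conn: sqlite3.Connection, table: str, column: str) -> dict[str, str]:
    """Map a lowercased form of each distinct value to the stored value.

    Table and column names are interpolated directly, so this sketch assumes
    they come from trusted configuration, never from user input.
    """
    rows = conn.execute(f"SELECT DISTINCT {column} FROM {table}").fetchall()
    return {str(value).lower(): value for (value,) in rows if value is not None}


def resolve_value(index: dict[str, str], term: str, cutoff: float = 0.6):
    """Return the stored value closest to the user's term, or None."""
    matches = difflib.get_close_matches(term.lower(), list(index), n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stores (name TEXT)")
conn.executemany("INSERT INTO stores VALUES (?)", [("Zara",), ("H&M",), ("Uniqlo",)])

index = build_value_index(conn, "stores", "name")
exact = resolve_value(index, "zara")       # case-insensitive match
typo = resolve_value(index, "Zarra")       # close misspelling
unknown = resolve_value(index, "Walmart")  # nothing similar enough
```

An embedding‑based index would also catch synonyms that share no characters with the stored value, which string similarity cannot do; the lookup shape stays the same.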

That design delivers two practical advantages:

Authoritative context. Working beside the database means the agent always consults up‑to‑date schemas and constraints, eliminating drift and stale copies.
Actionable confidence. Every response is tagged with a relevance/faithfulness score. Client code can display high‑confidence answers, request clarification for borderline cases, or discard low‑confidence results—no guesswork required.
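On the client side, acting on a confidence score can be as simple as the sketch below. Everything in it is an assumption for illustration: the AgentResponse shape, a score in the range 0 to 1, and the two thresholds are hypothetical and would need tuning against your own data.

```python
from dataclasses import dataclass


@dataclass
class AgentResponse:
    """Hypothetical response shape; the real payload may differ."""
    answer: str
    confidence: float  # assumed to lie in [0, 1]


def route(response: AgentResponse, show_at: float = 0.8, clarify_at: float = 0.5) -> str:
    """Decide what the client does with an answer, based on its score."""
    if response.confidence >= show_at:
        return "display"
    if response.confidence >= clarify_at:
        return "clarify"
    return "discard"


high = route(AgentResponse("12 SKUs are below reorder level", 0.92))
borderline = route(AgentResponse("Possibly 3 SKUs", 0.60))
low = route(AgentResponse("Unknown", 0.20))
```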

Application‑level agents still orchestrate tasks and user intent; database‑level agents focus on precise, efficient SQL generation. The separation keeps each layer simple and the answers trustworthy.

How we used LlamaIndex

LlamaIndex supplied the key building blocks:

A retrieval‑augmented pipeline that feeds just‑enough schema context into the LLM.
SQLTableRetrieverQueryEngine, which translates that context plus the prompt into syntactically correct SQL.
AgentRunner, which gives us complete control over what gets sent to the LLM in each turn, trading tokens for latency only when the question is genuinely hard.
A pluggable vector‑store layer, so we could slot in MariaDB Vector without forking upstream code.
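To illustrate what "just‑enough schema context" means, the sketch below picks the tables relevant to a question and renders only their definitions for the prompt. It deliberately does not use the LlamaIndex API: simple keyword overlap stands in for vector retrieval, sqlite3 stands in for the real database, and the table descriptions and helper names are hypothetical.

```python
import re
import sqlite3


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_tables(question: str, descriptions: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank tables by word overlap between the question and each description."""
    words = tokenize(question)
    scored = [
        (len(words & tokenize(f"{desc} {name}")), name)
        for name, desc in descriptions.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored if score > 0][:top_k]


def schema_context(conn: sqlite3.Connection, tables: list[str]) -> str:
    """Return the CREATE statements for the selected tables only."""
    ddl = []
    for table in tables:
        row = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if row:
            ddl.append(row[0])
    return "\n".join(ddl)


# Hypothetical schema and descriptions, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (sku TEXT, store TEXT, on_hand INTEGER)")
conn.execute("CREATE TABLE sales (sku TEXT, sold_on TEXT, qty INTEGER)")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")

descriptions = {
    "stock": "current stock levels on hand per store and product",
    "sales": "historical sales orders by product and date",
    "employees": "staff payroll and shifts",
}

selected = select_tables("How did sales compare with current stock?", descriptions)
context = schema_context(conn, selected)
```

Only the two relevant table definitions reach the prompt; the unrelated employees table never does, which keeps the prompt short and removes a source of wrong joins.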

In short, LlamaIndex handled the orchestration; SkySQL supplied the schema awareness, vector index, and execution sandbox; together, they formed a feedback loop tight enough to keep hallucinations at bay.

Live Demos

SkyAI Agent Querying

During the webinar, we invoked an “Inventory Optimizer” SkyAI Agent, pointed it at sample retail data, and fired off questions like:

“Which products at Zara are about to stock out?”
“Compare turnover ratios for Zara and H&M.”
“Show seasonal sales differences between Apple and Samsung.”

Each time, the agent generated new SQL, explained its reasoning, and returned a clean JSON answer. The audience saw live that deterministic text‑to‑SQL is possible without writing code or exposing the database to uncontrolled prompts.

MCP‑Driven Dev Workflow

The webinar wrapped with a quick look at our new MCP server—an open‑source endpoint that lets any MCP client (we used Cursor) speak to SkySQL in plain English. From the IDE, we spun up a free serverless database in under a second, listed available SkyAI Agents, and immediately queried the “Inventory Optimizer” agent—all without writing a line of SQL or shell script. For developers, that means the full database lifecycle and agent interface sit one prompt away, directly inside the tools they already use.

Practical Tips for Designing Your DB Agents

Keep agents small and focused. Give each one a clear purpose and no more than about ten tables; if you need broader coverage, treat several narrow agents as tools and route between them. Abstract ugly joins into database views so the LLM sees clean surfaces, and add friendly synonyms (“Service” → “database service”) or column‑level hints (“use LIKE for string filters”) to cut ambiguity.
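As a minimal sketch of the view tip, the example below hides a three‑table join with cryptic names behind one readable view. It assumes sqlite3 as a stand-in database, and the table names, view name, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Cryptic base tables, as often found in real schemas (hypothetical).
    CREATE TABLE t_prd (prd_id INTEGER PRIMARY KEY, prd_nm TEXT);
    CREATE TABLE t_str (str_id INTEGER PRIMARY KEY, str_nm TEXT);
    CREATE TABLE t_inv (prd_id INTEGER, str_id INTEGER, qty_oh INTEGER);

    INSERT INTO t_prd VALUES (1, 'Linen Shirt'), (2, 'Denim Jacket');
    INSERT INTO t_str VALUES (1, 'Zara Downtown'), (2, 'H&M Mall');
    INSERT INTO t_inv VALUES (1, 1, 3), (2, 1, 40), (1, 2, 12);

    -- The clean surface the agent is allowed to see.
    CREATE VIEW store_inventory AS
    SELECT s.str_nm AS store_name,
           p.prd_nm AS product_name,
           i.qty_oh AS units_on_hand
    FROM t_inv i
    JOIN t_prd p ON p.prd_id = i.prd_id
    JOIN t_str s ON s.str_id = i.str_id;
    """
)

# The SQL the agent now has to write needs no joins, and follows the
# "use LIKE for string filters" hint.
rows = conn.execute(
    "SELECT product_name, units_on_hand FROM store_inventory "
    "WHERE store_name LIKE ? ORDER BY units_on_hand",
    ("%Zara%",),
).fetchall()
```

With the join logic fixed inside the view, the model only has to choose columns and filters, which leaves far less room for a wrong join key.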

Define what’s out of scope as explicitly as what’s in. Store high‑cardinality fields in a vector index for fuzzy matching, but obfuscate sensitive data in any examples you feed the model. Test in short cycles: prompt the agent with increasingly tricky questions, compare the output to a set of “golden” SQL‑answer pairs, and tighten context or prompts until the score is where you need it.
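The short test cycle can be sketched as a small scoring loop that runs the agent's SQL and a golden query against the same data and compares result sets. This is an illustration under assumptions: sqlite3 stands in for the database, fake_agent stands in for a real agent call, and the questions, queries, and data are hypothetical.

```python
import sqlite3
from collections import Counter


def same_result(conn: sqlite3.Connection, candidate_sql: str, golden_sql: str) -> bool:
    """Compare result sets, ignoring row order; a failing query counts as a miss."""
    try:
        got = conn.execute(candidate_sql).fetchall()
    except sqlite3.Error:
        return False
    want = conn.execute(golden_sql).fetchall()
    return Counter(got) == Counter(want)


def score(conn, generate_sql, golden_pairs) -> float:
    """Fraction of golden questions for which the generated SQL matches."""
    hits = sum(same_result(conn, generate_sql(q), sql) for q, sql in golden_pairs)
    return hits / len(golden_pairs)


# Hypothetical data and golden set, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (sku TEXT, store TEXT, on_hand INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [("A1", "Zara", 2), ("B2", "Zara", 50), ("A1", "H&M", 9)],
)

golden_pairs = [
    ("Which SKUs at Zara have fewer than 5 units?",
     "SELECT sku FROM stock WHERE store = 'Zara' AND on_hand < 5"),
    ("How many units does each store hold?",
     "SELECT store, SUM(on_hand) FROM stock GROUP BY store"),
]

# Stand-in for the agent: one correct answer, one with an invented column.
canned = {
    golden_pairs[0][0]: "SELECT sku FROM stock WHERE on_hand < 5 AND store = 'Zara'",
    golden_pairs[1][0]: "SELECT store, SUM(quantity) FROM stock GROUP BY store",
}


def fake_agent(question: str) -> str:
    return canned[question]


accuracy = score(conn, fake_agent, golden_pairs)
```

Comparing results rather than SQL text lets differently phrased but equivalent queries pass. Questions whose answer depends on row order would need an ordered comparison instead of the Counter used here.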

Where to go from here

Watch the webinar on-demand here.

SkyAI Agents are available today in the SkySQL free tier. Spin up a serverless instance, open the agent builder, and ask your schema the questions your business actually cares about. Point your client at the SkyAI endpoint or your own MySQL or MariaDB database and let your application agent delegate the hard SQL work to the database itself.

Bringing goal‑driven agents to operational data doesn’t have to be a leap of faith. With SkySQL, it’s just another API call, one that finally speaks SQL you can trust.