Written by — Saku Vaittinen, Senior data engineer
Let's begin with how most data teams "use AI" today:
They open an AI chatbot, type a prompt, copy the generated SQL, paste it into their IDE, run it, hit an error, copy the error back into the chat window, and repeat. This isn't AI-assisted development; it is just swapping Stack Overflow for a chat window. The human is still doing almost all the work, acting as the manual, biological glue between the model and the execution environment. The result is constant context switching, context window limits, and ultimately underwhelming time-to-value.
If you find yourself doing this, it’s time to stop. The paradigm is shifting from copy-paste loops to agentic data development: a world where AI agents don't just output text and code, but safely orchestrate, execute, and iterate on workloads directly within your data platform.
The difference between an AI chatbot and an agentic workflow is fundamentally about who (or, increasingly, what) is behind the steering wheel.
In a traditional setup, the human moves instructions and outputs between the AI and the actual systems. In an agentic setup, the agent operates directly in the development environment, authenticated as the developer, reading from real sources and writing to real targets. It understands the system, maps content, develops models, compares outputs, and hands the work off to the next agent in the chain. All this happens under the developer's supervision, without the developer having to operate every step.
This is only possible when agents have the right context. Agent skills are the instructions for how to interact with specific systems and environments. On top of prompts, agents are also given tools to work with. They run on the developer's machine, authenticated with the developer's credentials, within approved tooling. The human stays in the loop; they just stop being the loop.
The "maps" your agents need to understand your data can be tricky to figure out on your own.
Data migration is one of the clearest places to see agentic workflows prove their value, because the work is large in volume, highly repetitive in structure, and brutally unforgiving about correctness.
A typical large-scale migration moves thousands of transformation jobs from a legacy platform to a modern stack, for example from Matillion to dbt. This would traditionally require years of developer time. The pattern is always the same: understand the source logic, map it to the target system's conventions, generate the new model, and validate the output against the original. A four-step loop, repeated thousands of times.
Agents handle this as a pipeline. A planning agent reads and understands the source system's structure. A modeling agent maps the legacy logic to the target conventions and generates the new models, e.g. dbt models running on Snowflake, but the pattern stays the same regardless of the stack. A comparison agent then runs both the old and new pipelines against the same data, identifies any mismatches, and returns its findings to the modeling agent for correction. The loop continues until parity is confirmed. Critically: the agents never touch the data directly. Snowflake and dbt do the actual transformation. The agents orchestrate, generate, and validate within the developer's authenticated environment, with every action logged for review.
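The generate, compare, and correct loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual implementation: the names `migrate_job`, `compare_outputs`, `ParityReport`, and `EscalationNeeded` are hypothetical, the modeling agent and the platform run are passed in as plain callables, and parity is simplified to a row-level multiset comparison of the two outputs.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

Row = Tuple[Any, ...]  # one output row as a tuple of column values


@dataclass
class ParityReport:
    """Hypothetical findings of the comparison step."""
    missing: List[Row]      # rows the legacy job produced but the new model did not
    unexpected: List[Row]   # rows the new model produced but the legacy job did not

    @property
    def ok(self) -> bool:
        return not self.missing and not self.unexpected


class EscalationNeeded(Exception):
    """Raised when parity is not reached, so a human takes over."""


def compare_outputs(legacy_rows: List[Row], new_rows: List[Row]) -> ParityReport:
    # Multiset comparison: duplicate rows count, row order does not.
    legacy, new = Counter(legacy_rows), Counter(new_rows)
    return ParityReport(
        missing=list((legacy - new).elements()),
        unexpected=list((new - legacy).elements()),
    )


def migrate_job(
    legacy_rows: List[Row],
    generate_model: Callable[[Optional[ParityReport]], Any],  # stands in for the modeling agent
    run_model: Callable[[Any], List[Row]],                    # stands in for the platform run
    max_attempts: int = 3,
) -> Tuple[Any, int]:
    """Regenerate the model until its output matches the legacy output."""
    feedback: Optional[ParityReport] = None
    for attempt in range(1, max_attempts + 1):
        model = generate_model(feedback)      # mismatches are fed back for correction
        report = compare_outputs(legacy_rows, run_model(model))
        if report.ok:
            return model, attempt             # parity confirmed
        feedback = report
    # Escalate instead of guessing forward once the attempt budget is spent.
    raise EscalationNeeded(f"no parity after {max_attempts} attempts")
```

In this sketch the attempt limit is what turns "escalate rather than guess" into something enforceable: once the budget is spent, the loop raises an exception for a human to handle instead of continuing to regenerate.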
The entire pipeline is defined as skills stored in version control, reviewable by any developer on the team. When something unexpected happens, the agent escalates rather than guessing forward. This is what makes it safe to run at scale.
A lot of vendors claim you can just throw an agent on top of your data stack and solve your problems overnight. In a small, clean demo that looks magical. The problem is that the real world is neither clean nor small; quite the opposite. In a complex enterprise environment with legacy systems and messy pipeline implementations built up in historical layers, the approach falls apart completely.
AI agents don't replace the need for solid foundations. What they do is expose every weakness in those foundations faster. This holds true regardless of whether the weakness lies in technical systems or business processes. An agent navigating a poorly documented, untested data platform will confidently produce wrong answers. Unlike a human engineer who slows down when something feels off, an unconstrained agent follows bad logic at full speed.
This means disciplined engineering practices matter more now, not less. Agents must write code to version control, run tests in isolated schemas, and submit pull requests for human review. Clean documentation and semantic definitions are the maps agents use to understand what your data actually means. Strict permissioning and guardrails need to be hardcoded into the environment, not trusted to the agent's judgment.
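As a rough illustration of a guardrail that lives in the environment rather than in the agent's judgment, the sketch below checks every requested action against a fixed policy before anything runs. All names here (`Action`, `Policy`, `execute`, and the schema names in the usage note) are hypothetical, and a real setup would more likely enforce this through the data platform's own roles and grants; the point is only that the check is deny-by-default and sits outside the agent.

```python
from dataclasses import dataclass
from typing import Any, Callable, FrozenSet


@dataclass(frozen=True)
class Action:
    """A request made by an agent (hypothetical shape)."""
    operation: str   # "read", "write", or anything else
    schema: str      # schema the action targets


@dataclass(frozen=True)
class Policy:
    """Fixed by the environment; the agent cannot modify it."""
    readable: FrozenSet[str]
    writable: FrozenSet[str]

    def permits(self, action: Action) -> bool:
        if action.operation == "read":
            return action.schema in self.readable or action.schema in self.writable
        if action.operation == "write":
            return action.schema in self.writable
        return False  # deny by default: unknown operations are never allowed


def execute(action: Action, policy: Policy, runner: Callable[[Action], Any]) -> Any:
    """Run the action only if the policy allows it."""
    if not policy.permits(action):
        raise PermissionError(
            f"{action.operation} on schema {action.schema!r} is not permitted"
        )
    return runner(action)
```

With a policy such as `Policy(readable=frozenset({"analytics"}), writable=frozenset({"agent_sandbox"}))`, an agent could read production data and write to its isolated schema, while a write or drop against `analytics` would raise `PermissionError` before reaching the platform.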
The agentic layer amplifies whatever is beneath it. Build on a weak foundation and you get fast, expensive mistakes. Build on a strong one and you get genuine leverage.
"What do you mean copy-pasting between a chatbot and my IDE isn’t an agentic workflow?"
The goal should never be to replace your data engineering team. Instead, it’s about amplifying what a strong team can achieve by automating repetitive implementation work.
Before deploying agents across your stack, you need to honestly assess whether your architecture is ready to support them. At Recordly, we help organizations get there through readiness assessments, building secure MCP foundations, and delivering proven workflows for debugging, migrations, and automated operations.
The technology has evolved and the actors on your platform have changed. Time to stop copy-pasting and start orchestrating.

Reach out to us whenever you need help getting agent-ready. Our team of more than 20 experts in agentic data development is ready to support you!