Written by — Mika Heino, Data Architect
2025 put embedding agents and multimodal AI at the heart of data platforms. Here’s what Snowflake, AWS, Google, Microsoft, and Databricks launched in 2025.
After a wild generative‑AI boom in 2023 and a year of vendors racing to push preview features into general availability in 2024, 2025 was the year of embedding agents and multimodal AI straight into the heart of the data platform. Large language models (LLMs) and multimodal functions crawled into the SQL layer so you can classify images, transcribe audio or ask an AI assistant to build a dashboard, all without leaving your warehouse. Ingestion and governance didn’t stand still either. What follows is my annual recap of the biggest product announcements and GA updates across the major platforms.
Personal note: Most of this blog post was written in late November 2025, and I checked the status of each release (both preview and GA) at the time of writing. I double-checked these statuses before publishing the blog. However, there may still be some errors, for which I apologize. I have included links to the features, allowing you to view the correct status by visiting the vendors' sites and release notes.
Also: If you're on mobile, turn your phone sideways for a better reading experience.
What data platforms are in the market then, and how do they differ? The list is rather long: Amazon Redshift, Google BigQuery, Microsoft Fabric, Snowflake, Databricks, Firebolt, Oracle Autonomous Data Warehouse, SAP Data Warehouse Cloud, Teradata Vantage, IBM Db2 Warehouse on Cloud and Cloudera Data Platform are all products on the market that offer capabilities for data warehousing, machine learning, and real-time analytics.
If you want to check only specific data warehouses and their new product updates, I've created the following table of contents. Additionally, I've included a product comparison matrix where I have placed all the new releases in their respective categories.
Product comparison matrix
Azure / Microsoft Fabric
Databricks
Snowflake
AWS / Amazon Redshift
Google BigQuery
Conclusion: Toward an AI-native Data Platform
Wishlist for 2026 and beyond

Product comparison matrix

The 2025 releases illustrate how vendors are converging on a full stack of AI‑enabled analytics, ingestion and operational capabilities. The matrix below highlights whether each platform introduced notable features this year.
| Feature | Snowflake | Databricks | Azure | Google | AWS |
| --- | --- | --- | --- | --- | --- |
| Serverless Compute "A service where compute is separated from storage" | Virtual Warehouses | SQL Serverless | SQL Analytics endpoint | BigQuery | Redshift Serverless |
| ML "Ability to pre-process, train, manage and deploy ML models" | Container Services / Snowpark ML / Snowflake Notebooks with Container Runtime | Databricks Runtime / Runtime for ML | Synapse Data Science | Vertex AI / Vertex AI API | Amazon SageMaker / Amazon Bedrock |
| Application runtime "Ability to host & run any kind of workload in data platform" | Native Apps / Container Services | Container Services | Azure Containers / AKS | Containers | Amazon ECS / EKS / Fargate |
| AI & LLM features "Ability to leverage LLM based services in a simpler manner" | Snowflake Intelligence / | Agent Bricks / | Copilot in Fabric | AI functions within SQL | SageMaker + Bedrock (outside the DWH) |
| User Assistance "Ability to assist end users to generate SQL and code with natural language" | Cortex Code | Databricks Assistant | Copilot for several services including Power BI and Fabric Databases | Duet AI in BigQuery | Amazon CodeWhisperer / Amazon Q |
| User Search "Ability to search data assets with natural language" | Universal Search | Lakehouse IQ / Semantic Search | N/A | N/A | N/A |
| Programmatic Access "Ability to use data platform services through code" | SQL REST API / SnowSQL | SQL Statement Execution API / CLI | Azure CLI | BigQuery REST API / bq command-line tool | AWS CLI |
| OLTP functionalities "Ability to serve same data assets in columnar and row format while enforcing integrity" | Hybrid Tables | Lakebase | N/A | N/A | N/A |
| Notebook Experience "Ability to run code in Notebook manner" | Snowflake Notebooks / Snowflake Notebooks with Container Runtime | Databricks Notebooks | Synapse Data Engineering | Vertex AI Notebooks | Amazon SageMaker |
| Marketplace "Ability to buy, sell and search data products or add-ons to your data platform" | YES | YES | YES | YES | YES |
| ETL "Ability to ingest data and process it without the need of 3rd party services" | Openflow (new) | Lakeflow + Lakeflow Designer (new) | Data Factory, Azure Functions | Dataflow, Data Fusion, Cloud Run & Cloud Functions | AWS Glue, Step Functions |
| Data Visualization "Ability to visualize data for application & reporting usage" | Streamlit | Dashboards | Power BI | Looker | Amazon QuickSight |
| Streaming "Ability to ingest & process real-time data" | Snowpipe Streaming (high-performance architecture) (new) | Spark Structured Streaming | Synapse Real-Time Analytics, Event Hub, Azure Stream Analytics | Storage Write API, Dataflow, Pub/Sub | Amazon Kinesis Data Streams |
| Spark integration "Ability to run Spark" | Snowpark Connector for Spark (new) | Databricks Runtime | Synapse Data Engineering | Serverless Spark (new) | EMR Serverless |

Snowflake
Snowflake’s 2025 announcements make it clear that the company is moving well beyond its warehouse roots. Snowflake wants to be the place where OLTP workloads, AI agents and data science coexist on the same governed data foundation. By launching Snowflake Postgres (private preview) and the open‑source pg_lake extension, Snowflake is tearing down the wall between transactional and analytical data: operational Postgres tables can now live inside the AI Data Cloud and query Iceberg tables without extract–load–transform. Meanwhile, Snowflake doubled down on AI with Cortex AISQL functions and Snowflake Intelligence, and invested in pipeline automation (Openflow), adaptive warehouses and Spark integration. Taken together, these releases show Snowflake’s ambition to become an AI‑native data compute fabric where SQL, Python, Spark and agents all run under one umbrella.
To see how this strategy plays out in practice, here are the 2025 releases that underpin Snowflake’s shift toward an AI‑native lakehouse.
Snowflake Intelligence (GA) – announced on 4 November, Snowflake Intelligence marries Anthropic and OpenAI models with Snowflake’s semantic layer so you can ask questions in plain English, get instant answers and auto‑generate charts. It fetches structured and unstructured data using Cortex Analyst and Cortex Search agents. The feature had been in preview and is now generally available.
Video 1: Talk To Your Data: Snowflake Intelligence Demo by Jeff Hollan
Cortex AISQL functions (GA) – in the same update Snowflake shipped the GA version of its AISQL functions. New functions include AI_CLASSIFY (classify text and images), AI_TRANSCRIBE (transcribe audio and video), AI_EMBED (generate embedding vectors) and AI_SIMILARITY (measure embedding similarity). They let you perform AI‑powered tasks directly in SQL without calling external services.
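To make this concrete, here is a minimal sketch of two of these functions in action; the table and column names are hypothetical, and the calls follow Snowflake's documented AI_CLASSIFY and AI_SIMILARITY signatures.

```sql
-- Hypothetical support_tickets table; AI_CLASSIFY returns an object
-- whose labels field holds the chosen category.
SELECT
    ticket_id,
    AI_CLASSIFY(ticket_text,
                ['billing', 'bug report', 'feature request']) AS category
FROM support_tickets;

-- Compare two strings by embedding similarity, directly in SQL
SELECT AI_SIMILARITY('running shoes for trail use',
                     'trail running footwear') AS similarity_score;
```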
Snowflake Openflow (GA) – the Apache NiFi‑based ingestion service became GA, letting you build data flows without spinning up your own orchestration. Openflow runs on Snowpark Container Services or on BYOC.
dbt Projects on Snowflake (GA) – native execution of dbt projects in Snowsight workspaces reached GA in November. The feature makes it easier to build, run and manage dbt projects inside Snowflake and improves performance and governance.
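Running a project then becomes a single SQL statement. A hedged sketch, assuming a hypothetical project object named analytics.prod.my_dbt_project; the EXECUTE DBT PROJECT command follows Snowflake's announcement, but check the docs for the exact argument format:

```sql
-- Run the dbt project natively inside Snowflake; ARGS mirrors the dbt CLI
EXECUTE DBT PROJECT analytics.prod.my_dbt_project ARGS = 'run --select staging';
```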
Snowflake Postgres (Public Preview) – Following the acquisition of Crunchy Data, Snowflake introduced Snowflake Postgres, a fully managed PostgreSQL‑compatible service. It maintains full compatibility with open‑source Postgres so customers can run operational workloads without rewriting code and still use existing extensions and client libraries. The service unifies transactional Postgres data with Snowflake’s AI Data Cloud, eliminating costly data movement and enabling context‑aware AI agents and applications. Snowflake has positioned the preview as the next step in its Unistore strategy for hybrid transactional‑analytical workloads.
pg_lake open‑source extension (GA) – To complement Snowflake Postgres, Snowflake open‑sourced pg_lake, a set of PostgreSQL extensions that allow developers to query, manage and write to Apache Iceberg tables directly from Postgres. The extensions let Postgres interact with Iceberg tables in object storage (CSV, Parquet and JSON) via standard SQL, giving Postgres users a direct path into Snowflake’s lakehouse without data extraction.
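In practice this means ordinary Postgres DDL can target Iceberg. A minimal sketch, assuming the pg_lake extensions are installed and object storage is configured (exact extension and option names may differ from this sketch):

```sql
-- Assumption: pg_lake is installed as an extension set on this instance
CREATE EXTENSION pg_lake;

-- The iceberg table access method stores table data as Iceberg in
-- object storage while Postgres handles the SQL
CREATE TABLE events (
    event_id   bigint,
    payload    jsonb,
    created_at timestamptz
) USING iceberg;

-- Reads and writes are plain Postgres SQL
INSERT INTO events VALUES (1, '{"type": "click"}', now());
SELECT count(*) FROM events WHERE created_at > now() - interval '1 day';
```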
AI‑powered governance and Horizon Catalog enhancements – The Build announcements included new governance features in the Horizon Catalog, such as automated personally identifiable information (PII) detection and redaction, immutable backups and Tri‑Secret Secure encryption for hybrid tables. A new Copilot for Horizon Catalog uses Snowflake’s Cortex AI to answer security and governance questions through a conversational interface. These enhancements strengthen trust and compliance while giving administrators AI assistance.
Snowpark Connector for Spark – Snowflake launched a client connector that lets users run Apache Spark code directly on Snowflake’s vectorized analytics engine without spinning up separate Spark clusters. This Snowpark Connector promises up to 5.6× faster performance and about 40 % lower costs versus running Spark in a dedicated cluster. It enables organisations to reuse existing Spark DataFrame, Spark SQL and UDF code while benefiting from Snowflake’s security and governance.
Other notable updates – Snowflake also announced Standard Warehouse Gen 2, promising more than twice the performance of the previous warehouse generation; Semantic Views (public preview) for sharing business metrics; expanded Apache Iceberg read/write support; and Adaptive Compute, a private preview that auto‑scales warehouse resources. These are mostly still in preview but indicate Snowflake’s future direction.

Databricks
Databricks continued to blur the lines between data warehouse, lakehouse and application platform in 2025. The flagship announcement was Lakebase, a Postgres‑compatible, serverless database that integrates with Delta/Iceberg tables and offers scale‑to‑zero and copy‑on‑write branching.
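Because Lakebase speaks the Postgres wire protocol, existing drivers and plain SQL should work unchanged. A minimal sketch with a hypothetical orders table (scale-to-zero and branching are handled by the platform, not in SQL):

```sql
-- Ordinary OLTP DDL and DML against a Lakebase endpoint, e.g. via psql
CREATE TABLE orders (
    order_id    bigserial PRIMARY KEY,
    customer_id bigint NOT NULL,
    status      text NOT NULL DEFAULT 'pending',
    created_at  timestamptz DEFAULT now()
);

INSERT INTO orders (customer_id, status) VALUES (42, 'paid');

-- Low-latency point lookup, the classic operational pattern
SELECT * FROM orders WHERE order_id = 1;
```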
Video 2: Databricks Data + AI Summit Keynote Day 1
This positions Databricks to run operational workloads natively on the lakehouse. On the AI front, Agent Bricks and Databricks Apps allow non‑developers to build and deploy AI agents and interactive applications without managing infrastructure. Lakeflow and its drag‑and‑drop Lakeflow Designer simplify streaming and batch pipelines with serverless compute, while Unity Catalog enhancements enable full read/write on Iceberg tables and cross‑engine interoperability. Databricks remains highly capable but still requires more technical expertise than its competitors; nevertheless, it’s positioning itself as the platform where data engineers and data scientists can unify ETL, OLTP and AI agents under one roof.
Databricks’ 2025 announcements show the platform is expanding beyond its core Lakehouse roots, introducing a Postgres‑compatible transactional engine, production‑ready AI agents, unified data engineering pipelines and more. Here are some of the highlights:
Video 3: LakeFlow Designer demo


Azure / Microsoft Fabric
After its splashy debut in 2024, Microsoft Fabric spent 2025 refining its building blocks rather than launching entirely new paradigms. The most impactful change was autoscale billing and job bursting for Spark workloads: compute now scales automatically and you pay only for execution time. Fabric also introduced variable libraries and Copilot inline code suggestions in notebooks (preview), making it easier to manage configuration and get AI‑powered coding help. Result set caching and a new SQL operator for Eventstream reduce query latency and let users write custom transformations in real‑time pipelines. Improvements to Data Factory include pipeline trigger management, incremental copy resets and REST APIs for deployment pipelines. While Fabric still feels like a work in progress compared with Snowflake or Databricks, Microsoft is clearly aiming to turn it into a one‑stop SaaS lakehouse with integrated governance, notebooks, real‑time analytics and Copilot assistance.
Video 4: Microsoft Fabric: The Data Platform for the AI Frontier
These updates illustrate how Microsoft is tuning Fabric into a full SaaS lakehouse, with smarter scaling, better notebooks and real‑time intelligence.
Notebook monitoring and T‑SQL magic – new panes show query history and execution duration, and notebooks gained the %%tsql magic command, which allows data engineers to combine SQL queries and Pandas transformations in the same notebook; a sketch follows below.
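A hedged sketch of the magic in a Fabric notebook cell; the table is hypothetical and the cell body is ordinary T-SQL:

```sql
%%tsql
-- Runs as T-SQL against the connected Fabric database; results can be
-- picked up in subsequent Python cells for Pandas transformations.
SELECT TOP 10 product_id, SUM(quantity) AS total_sold
FROM dbo.sales          -- hypothetical table
GROUP BY product_id
ORDER BY total_sold DESC;
```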

AWS / Amazon Redshift
In 2025, AWS focused on making Redshift more resilient and easier to integrate with operational databases rather than stuffing AI into the warehouse. The headline feature was History Mode for zero‑ETL pipelines, which automatically tracks every insert, update and delete when replicating data from Amazon Aurora or RDS, writing immutable history tables for point‑in‑time queries and SCD modelling. AWS also enabled cluster relocation by default for RA3 clusters and expanded Multi‑AZ deployments to new regions, raising the SLA to 99.99 % and enabling fast failover. Multidimensional Data Layouts (MDDL) went GA, dynamically sorting data to improve query performance by up to 10×. Zero‑ETL integrations with RDS for Oracle and Aurora were rolled out to additional regions, and the auto‑copy continuous ingestion feature expanded to more regions. Redshift remains a sturdy choice for AWS‑centric architectures, but its AI story still depends on SageMaker/Bedrock rather than native warehouse features.
These releases focused on strengthening Redshift’s zero‑ETL capabilities, reliability and query performance in 2025.
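As an illustration of History Mode, here is a hedged sketch of a point-in-time query; the customers_history table is hypothetical, and the metadata columns follow those described in AWS's documentation (_record_create_time, _record_delete_time, _record_is_active):

```sql
-- State of each customer as of 1 June 2025: rows whose validity
-- interval covers the timestamp are the ones visible at that moment
SELECT customer_id, email
FROM customers_history
WHERE _record_create_time <= TIMESTAMP '2025-06-01 00:00:00'
  AND (_record_delete_time IS NULL
       OR _record_delete_time > TIMESTAMP '2025-06-01 00:00:00');
```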


Google BigQuery
Google continued its steady evolution of BigQuery from an analytics engine into a full data platform. In 2025, BigQuery embraced the lakehouse trend with BigQuery Pipelines, Data Prep and a Data Science Agent for visual ETL and no‑code model building. The platform also added Semantic Search and Contribution Analysis to let users search unstructured data with embeddings and understand the drivers behind metric changes. The Universal Catalog and Business Glossary reached GA, enabling cross‑project dataset discovery and metadata governance. On the AI front, BigQuery introduced AI.SQL functions such as AI.IF, AI.CLASSIFY and AI.SCORE for filtering and classifying data via LLMs, and brought serverless Spark to BigQuery notebooks along with Vertex AI Agent Engine for multi‑agent LLM orchestration. While BigQuery remains a solid choice for Google‑centric environments, new features often lag Snowflake and Databricks; still, the 2025 releases show Google is serious about delivering an integrated lakehouse with built‑in AI.
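To give a feel for the AI functions, here is a hedged sketch; the reviews table is hypothetical and the exact parameter syntax of AI.CLASSIFY and AI.IF may differ by release, so treat this as the general shape rather than copy-paste code:

```sql
-- Classify reviews and keep only those about delivery speed,
-- all inside a single BigQuery query
SELECT
  review_id,
  AI.CLASSIFY(review_text,
              categories => ['praise', 'complaint', 'question']) AS label
FROM `my_project.my_dataset.reviews`
WHERE AI.IF(('Is this review about delivery speed? ', review_text));
```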
dbt + BigQuery: What’s new from Google Cloud Next 25
The features below demonstrate how BigQuery is evolving from a pure analytics engine into a serverless lakehouse with built‑in AI and pipelines.
Optional job creation mode (GA) – announced 27 May, this mode automatically optimises small queries and uses a cache to reduce latency. It’s off by default and can be enabled for dashboards and ad‑hoc queries.
Multi‑column data preparation tasks (GA) – on 22 May BigQuery Data Prep added the ability to operate on multiple columns at once, such as dropping them in one action. The feature is part of Gemini‑assisted data prep.
Dataplex automatic discovery (GA) – on 28 April, Dataplex automatic discovery in BigQuery began scanning Cloud Storage buckets and creating BigLake, external and object tables for analytics and AI.
Fine‑grained access control to the Iceberg metastore (GA) – on 21 April, BigQuery added fine‑grained permissions on Iceberg metastore tables and introduced new roles BigQuery Studio User and Gemini User. At the same time some older roles are no longer required.
Renaming of Analytics Hub and Dataplex Catalog – on 9 April Google announced that Analytics Hub is now BigQuery sharing and Dataplex Catalog is BigQuery universal catalog. The new names reflect the product’s broader role in data governance.
Iceberg resource management & Flink integration (GA) – on 8 April you can now create, edit and delete Apache Iceberg resources in the BigQuery metastore and connect it to Apache Flink.
Streaming Change Data Capture improvements (GA) – on 31 March _CHANGE_SEQUENCE_NUMBER became available to manage UPSERT ordering for streaming CDC, and you can include data‑prep tasks in BigQuery pipelines (see the sketch after this list).
Data Transfer Service custom reports for Google Ads (GA) – on 6 March the Data Transfer Service added support for Google Ads Query Language (GAQL), letting you ingest custom ad reports.
Gemini‑assisted Python code completion (GA) – on 3 March Gemini began providing context‑aware Python code suggestions in BigQuery Studio.
Stockholm region availability – on 4 March BigQuery launched in europe‑north2 (Stockholm).
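Regarding the streaming CDC item above, a hedged sketch of the table setup: a non-enforced primary key and the max_staleness option are what BigQuery CDC requires, while the _CHANGE_TYPE and _CHANGE_SEQUENCE_NUMBER pseudocolumns are supplied by the writer through the Storage Write API (the table name is hypothetical):

```sql
-- CDC target table: the primary key identifies rows to upsert/delete,
-- max_staleness controls how fresh background merges keep reads
CREATE TABLE my_dataset.orders (
  order_id   INT64,
  status     STRING,
  updated_at TIMESTAMP
)
OPTIONS (max_staleness = INTERVAL 15 MINUTE);

ALTER TABLE my_dataset.orders
  ADD PRIMARY KEY (order_id) NOT ENFORCED;
```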

There are many smaller updates, such as metadata caching for SQL translation, inspecting BigQuery API dependencies and conditional IAM access on datasets. Google also teased features at Next ’25 such as BigQuery Continuous Queries for real‑time SQL over streaming data and multimodal ObjectRef tables, but those remain pre‑announcements.

Conclusion: Toward an AI‑native Data Platform
The 2025 releases show just how mature the cloud data warehouse market has become. Every major platform now offers the core capabilities needed to build modern data products: serverless compute, integrated ETL/ELT, machine‑learning runtimes and generative‑AI interfaces. Vendors are coalescing around open table formats such as Iceberg and Delta Lake and richer semantic layers, while also experimenting with agentic AI and operational databases inside their analytics platforms.
This means technology is no longer the bottleneck. Snowflake, Databricks and Google are leaning hard into multimodal SQL and AI agents; Microsoft Fabric is morphing into a fully managed SaaS lakehouse; and AWS is doubling down on reliability and zero‑ETL integrations. Different products have different strengths, but they’re all converging on the idea that your warehouse, lakehouse, app platform and AI workbench should live in one place.
None of this changes the old truth: no platform can rescue a poor data model. Whichever service you pick, the quality of your analytics still hinges on a well‑designed, well‑maintained model. Choose a modelling approach (Data Vault, Kimball, Inmon, whatever fits), automate with tools like dbt, and resist the urge to bolt together every shiny new thing. Hybrid architectures often underperform—let your model drive your platform choice and keep governance and lineage front and centre.
Wishlist for 2026 and beyond

A built‑in vector database – many services now offer vector search, but none have a truly integrated vector store. A native solution would lower latency and cost for LLM‑powered applications.
Genuine cross‑platform tables – Iceberg shows promise, yet true write‑once/read‑anywhere is still aspirational. Being able to query the same table from Snowflake, Databricks and BigQuery without vendor‑specific connectors would be a game‑changer.
Deeper semantic layers – Snowflake’s Semantic Views and BigQuery’s business glossary are good starts, but we need first‑class metric definitions, lineage and security baked into every warehouse.
Agent governance and safety – as AI agents proliferate, vendors must provide robust evaluation and monitoring frameworks to ensure they act responsibly.
Sustainability metrics – with compute demands rising, built‑in carbon and energy reporting for queries and pipelines should become standard.
Native testing and lineage for pipelines – data platforms need the same CI/CD rigour as application development. Built‑in unit testing, continuous integration and lineage visualisation would raise the bar.
Location‑aware data controls – region and industry‑specific compliance rules are still largely DIY. Out‑of‑the‑box controls for residency, retention and privacy would ease adoption.
These wishes highlight a broader trend: data and AI platforms are merging, and with that merger come higher expectations for openness, accountability and sustainability.
Did you enjoy the article? In creating this blog, I listened to a lot of Radio Suomi (I need background noise to work), utilized various tools such as ChatGPT Agent mode for finding relevant articles, Google for finding images and the actual release notes that the Agents didn't find, Paint+ for image creation and editing, and finally my trusty Wacom board for drawing. At Recordly, we encourage the use of any necessary tools to expand your knowledge and enhance your creative freedom.
We are currently seeking talented data engineers and data architects to join our amazing team of data professionals! Learn more about us and apply below 👇