ADMS 2025
Sixteenth International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures
 

In conjunction with VLDB 2025, London, U.K.
Monday, September 1, 2025
 
 
Workshop Overview

The objective of this one-day workshop is to investigate opportunities in accelerating analytics workloads and data management systems. Over the years, the scope of database analytics has changed substantially, expanding from traditional OLAP, data warehousing, and ETL to HTAP, streaming/real-time processing, and edge/IoT, and finally to machine learning and deep learning workloads such as generative AI and vector/semantic databases. The increasing use of Large Language Models (LLMs) as a source for knowledge extraction for various end uses (e.g., in an AI assistant or agentic system) creates new opportunities for database systems. At the same time, hardware and software capabilities have seen tremendous improvements. The workshop aims to explore how database analytics can be accelerated using modern processors (e.g., commodity and specialized multi-core, many-core, chiplet, GPU, and FPGA processors), processing systems (e.g., hybrid, massively-distributed clusters, and cloud-based distributed computing infrastructure), networking infrastructures (e.g., RDMA over InfiniBand), memory and storage systems (e.g., storage-class memories like SSDs, active memories, NVRAM, and phase-change memory), multi-core and distributed programming paradigms like CUDA, MPI/OpenMP, and MapReduce/Spark, and integration with data-science/deep-learning frameworks such as scikit-learn, TensorFlow, or PyTorch. Exploratory topics such as DNA-based storage or quantum algorithms are also within the purview of the ADMS workshop. The intent of the ADMS workshop is to bring together people from diverse fields such as computer architecture, high-performance computing, systems, and programming languages to address key functionality and scalability problems in data management.

The current data management scenario is characterized by the following trends: traditional OLTP and OLAP/data warehousing systems are being used for increasingly complex workloads (e.g., integration of various AI technologies, petabytes of data, complex queries under real-time constraints, etc.); applications are becoming far more distributed, often consisting of different data processing components; non-traditional domains such as bio-informatics, social networking, mobile computing, sensor applications, and gaming are generating growing quantities of data of different types; economic and energy constraints are leading to greater consolidation and virtualization of resources; and analyzing vast quantities of complex data is becoming more important than traditional transactional processing.

At the same time, there have been tremendous improvements in CPU and memory technologies. Newer processors offer greater compute and memory capabilities, are power-efficient, and are optimized for multiple application domains. Commodity systems increasingly use multi-core processors with more than 6 cores per chip, and enterprise-class systems use processors with at least 32 cores per chip. Specialized multi-core processors such as GPUs have brought the computational capabilities of supercomputers to cheaper commodity machines. On the storage front, flash-based solid state devices (SSDs) are becoming smaller in size, cheaper in price, and larger in capacity. Emerging technologies like phase-change memory are on the near-term horizon and could be game-changers in the way data is stored and processed.

Despite these trends, these technologies currently see limited use in the data management domain. Naive exploitation of multi-core processors or SSDs often leads to unbalanced systems. It is, therefore, important to evaluate applications in a holistic manner to ensure effective utilization of CPU and memory resources. This workshop aims to understand the impact of modern hardware technologies on accelerating core components of data management workloads. Specifically, the workshop hopes to explore the interplay between overall system design, core algorithms, query optimization strategies, programming approaches, performance modelling and evaluation, etc., from the perspective of data management applications.

Topics of Interest

The suggested topics of interest include, but are not restricted to:

  • Hardware and System Issues in Domain-specific Accelerators
  • New Programming Methodologies for Data Management Problems on Modern Hardware
  • Query Processing for Hybrid Architectures
  • Large-scale I/O-intensive (Big Data) Applications
  • Parallelizing/Accelerating Machine Learning/Deep Learning Workloads
  • Accelerating training, inference, and storage of Large Language Models for Generative AI
  • Autonomic Tuning for Data Management Workloads on Hybrid Architectures
  • Algorithms for Accelerating Multi-modal Multi-tiered Systems
  • Applications of GPUs and other data-parallel accelerators
  • Energy Efficient Software-Hardware Co-design for Data Management Workloads
  • Parallelizing non-traditional (e.g., graph mining) workloads
  • Algorithms and Performance Models for modern Storage Sub-systems
  • Exploitation of specialized ASICs
  • Novel Applications of Low-Power Processors and FPGAs
  • Exploitation of Transactional Memory for Database Workloads
  • Exploitation of Active Technologies (e.g., Active Memory, Active Storage, and Networking)
  • New Benchmarking Methodologies for Accelerated Workloads
  • Applications of HPC Techniques for Data Management Workloads
  • Acceleration in the Cloud Environments
  • Accelerating Data Science/Machine Learning Workloads
  • Exploratory topics such as Generative AI, DNA-storage, Quantum Technologies

Keynote Presentations

Title: Using what we know about our data: compact metadata and amortisation to exploit locality and sparsity
Prof. Paul H J Kelly, Imperial College, London

Abstract: Parallelism is basically easy. Especially when the algorithms are dumb. But when we get smarter, and use more complicated data structures to support strategies that avoid redundant computation, things get messier. We need metadata that maps where we can find the non-zeroes, and where the repeated values are. This makes locality, parallelization and scheduling dependent on the actual data. This talk is about processing the metadata at runtime to determine how the computation itself should be done. The talk isn't about new research results; instead we map the problem space and some of the techniques, drawing on our diverse experience.

Title: On the role of storage and hardware acceleration in modern data management systems
Vincent Hsu, IBM Storage, and Haris Pozidis, IBM Research

Abstract: We are living in the era of data. Data is being generated and stored at unprecedented scales, estimated at 150 zettabytes in 2024, more than 90% of which is unstructured. This data can only be useful if it is used to generate insights, drive decisions and improve business processes. While this has been the case with structured data, using relational data management systems, such capabilities have been largely absent for unstructured data. With generative AI (GenAI) it is possible to process unstructured data, extract and encode (embed) semantic nibbles of information and organize these nibbles in so-called vector management systems (vector DBs), similarly to traditional relational databases. Analogous to SQL queries, vector DBs can be searched with natural language queries, using vector similarity search technology.

Data retrieval is a critical piece of information extraction and the driving mechanism of knowledge generation. By combining the efficiency and determinism of SQL search in relational DBs with the expressiveness of similarity search in vector DBs, one obtains a powerful combination, able to extract the most hidden of data insights. This is the first pillar of modern data management systems.
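As a toy illustration of the hybrid search the abstract describes, the sketch below (in Python/NumPy; all names and data are hypothetical, not from the speakers' system) combines the two pillars: an exact, SQL-style predicate over structured metadata first narrows the candidate set, and cosine similarity over embeddings then ranks what remains.

```python
import numpy as np

def top_k_similar(query, vectors, k=2):
    """Return indices of the k vectors most similar to the query
    by cosine similarity (the core primitive of vector search)."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                      # cosine similarity per row
    return np.argsort(-scores)[:k]      # best-scoring indices first

def hybrid_search(query, vectors, metadata, predicate, k=2):
    """Hybrid search: apply an exact predicate on structured metadata
    (the relational part), then rank survivors by vector similarity."""
    candidates = [i for i, m in enumerate(metadata) if predicate(m)]
    ranked = top_k_similar(query, vectors[candidates], k)
    return [candidates[i] for i in ranked]

# Toy corpus: four 2-d embeddings with structured metadata.
vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]])
meta = [{"year": 2024}, {"year": 2023}, {"year": 2024}, {"year": 2024}]

# "Among the 2024 documents, which are most similar to the query vector?"
hits = hybrid_search(np.array([1.0, 0.0]), vecs, meta,
                     lambda m: m["year"] == 2024)
# hits == [0, 3]: document 1 matches the query best but fails the predicate.
```

Production vector DBs replace the exhaustive scan with approximate nearest-neighbour indexes, and it is exactly this similarity scoring that GPUs and other accelerators parallelize.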

Another key trend in data management technology is the convergence of storage and data processing systems. Storage is where data live, access controls are maintained and where security can be easily enforced. This is also the most natural, cost-savvy, and energy efficient place to execute the entire data transformation pipeline. Hybrid relational and vector DBs can be embedded in the storage system for higher data security, data freshness and cost control. We argue that this is the second pillar of modern data management systems.

HW acceleration, offered by GPUs or other specialized accelerators, has been the enabler of the GenAI era, offering unprecedented processing capacity. In addition to its application in foundation model training and inference, HW acceleration has a pivotal role to play in data processing pipelines and hybrid search systems. GPUs and other accelerators can offer dramatic improvements in energy/performance efficiency and enable levels of scalability that are beyond the reach of traditional CPU-based computing. HW acceleration is the third pillar in the evolution of data management systems.

In this talk we will be discussing the above three main pillars of modern data management systems, as they are evolving to cater to the needs of GenAI and agentic applications and workflows.

Workshop Program (8.30am-5pm)

  • (8.35-9 am) High Throughput GPU-Accelerated FSST String Compression Tim Anema, Delft University of Technology, Joost Hoozemans, Voltron Data, Zaid Al-Ars, Delft University of Technology, and H. Peter Hofstee, IBM
  • (9-9.25 am) GPU-Accelerated Stochastic Gradient Descent for Scalable Operator Placement in Geo-Distributed Streaming Systems Tristan Joel Terhaag, Technische Universität Berlin, Xenofon Chatziliadis, Technische Universität Berlin, Eleni Tzirita Zacharatou, Hasso Plattner Institute, University of Potsdam, and Volker Markl, Technische Universität Berlin
  • (9.30-9.55 am) A Hot Take on the Intel Analytics Accelerator for Database Management Systems Christos Laspias, Andrew Pavlo and Jignesh Patel, Carnegie Mellon University
  • (10-10.30 am) Morning Break
  • (10.30-10.55 am) A Data Aggregation Visualization System supported by Processing-in-Memory Junyoung Kim, Madhulika Balakumar and Kenneth Ross, Columbia University
  • (11 am-12 pm) Keynote 1: Using what we know about our data: compact metadata and amortisation to exploit locality and sparsity Prof. Paul H J Kelly, Imperial College, London
  • (12-1.30 pm) Lunch Break
  • (1.30-1.55 pm) Demystifying CXL Memory Bandwidth Expansion for Analytical Workloads Georgiy Lebedev, Hamish Nicholson, Musa Ünal, Sanidhya Kashyap and Anastasia Ailamaki, EPFL
  • (2-3 pm) Keynote 2: On the role of storage and hardware acceleration in modern data management systems Vincent Hsu, IBM Storage, and Haris Pozidis, IBM Research
  • (3-3.30 pm) Afternoon Break
  • (3.30-3.55 pm) CXL-Bench: Benchmarking Shared CXL Memory Access Marcel Weisgut, Hasso Plattner Institute, University of Potsdam, Daniel Ritter, SAP, Florian Schmeller, Hasso Plattner Institute, University of Potsdam, Pınar Tözün, IT University of Copenhagen, and Tilmann Rabl, Hasso Plattner Institute, University of Potsdam
  • (4-4.25 pm) RISC-V Meets RDBMS: An Experimental Study of Database Performance on an Open Instruction Set Architecture Yizhe Zhang, Zhengyi Yang, Bocheng Han, University of New South Wales, Haoran Ning, Macquarie University, Xin Cao, John Shepherd, University of New South Wales, and Guanfeng Liu, Macquarie University
  • (4.30-4.55 pm) Micro-architectural Exploration of the Relational Memory Engine (RME) in RISC-V and FireSim Cole Strickler, University of Kansas, Ju Hyoung Mun, Brandeis University, Connor Sullivan, University of Kansas, Denis Hoornaert, Technical University of Munich, Renato Mancuso, Manos Athanassoulis, Boston University, and Heechul Yun, University of Kansas

Organization

Workshop Co-Chairs

For questions regarding the workshop please send email to contact@adms-conf.org.

Program Committee

  • Francesco Fusco, IBM Research, Zurich
  • Wentao Huang, National University of Singapore
  • Julia Spindler, TUM
  • Selim Tekin, Georgia Tech
  • Hubert Mohr-Daurat, Imperial College, London
  • Rathijit Sen, Microsoft
  • Hong Min, IBM T. J. Watson Research Center
  • Viktor Sanca, Oracle
  • Subhadeep Sarkar, Brandeis University

Important Dates

  • Paper Submission: Monday, 26 May, 2025, 9 am EST
  • Notification of Acceptance: Friday, 20 June, 2025
  • Camera-ready Submission: Friday, 18 July, 2025
  • Workshop Date: Monday, 1 September, 2025

Submission Instructions

Submission Site 

All submissions will be handled electronically via EasyChair.

Publication and Formatting Guidelines 

The ADMS'25 proceedings will be published as a part of the official VLDB Workshop Proceedings and indexed via DBLP.

We will use the same document templates as the VLDB conference. You can find them here.

It is the authors' responsibility to ensure that their submissions adhere strictly to the VLDB format detailed here. In particular, it is not allowed to modify the format with the objective of squeezing in more material. Submissions that do not comply with the formatting detailed here will be rejected without review. 

As per the VLDB submission guidelines, the paper length for a full paper is limited to 12 pages, excluding bibliography. However, shorter papers (at least 6 pages of content) are encouraged as well.