Δ-Mem: Efficient Online Memory for Large Language Models
Abstract
Large language models achieve state-of-the-art performance across a wide range of natural language processing tasks, but their large memory footprint makes them difficult to deploy on resource-constrained devices. In this paper, we propose Δ-Mem, a memory-efficient approach that reduces this footprint through a novel online memory allocation scheme. Our experiments show that Δ-Mem achieves significant memory savings while maintaining model accuracy.
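
The abstract does not spell out how the online allocation scheme works. As a purely illustrative sketch, and not Δ-Mem itself, the following Python snippet shows one common flavor of online memory allocation in LLM inference: a key/value cache that grows one fixed-size chunk at a time as tokens arrive, rather than reserving space for the maximum sequence length up front. All names (ChunkedKVCache, chunk_tokens) are hypothetical.

# Purely illustrative: chunked, on-demand KV-cache allocation.
# This is NOT the Δ-Mem algorithm (the abstract does not specify it);
# it only shows the general idea of online allocation, where memory
# grows with the actual sequence length instead of being reserved
# up front for the maximum length.

import numpy as np

class ChunkedKVCache:
    """Grows the key/value cache one fixed-size chunk at a time."""

    def __init__(self, n_heads: int, head_dim: int, chunk_tokens: int = 64):
        self.n_heads = n_heads
        self.head_dim = head_dim
        self.chunk_tokens = chunk_tokens
        self.chunks: list[np.ndarray] = []  # each: (chunk_tokens, n_heads, 2*head_dim)
        self.length = 0  # tokens currently stored

    def append(self, kv: np.ndarray) -> None:
        """Store the key/value vectors for one new token."""
        assert kv.shape == (self.n_heads, 2 * self.head_dim)
        slot = self.length % self.chunk_tokens
        if slot == 0:  # current chunk is full (or cache is empty): allocate online
            self.chunks.append(
                np.empty((self.chunk_tokens, self.n_heads, 2 * self.head_dim),
                         dtype=np.float16)
            )
        self.chunks[-1][slot] = kv
        self.length += 1

    def view(self) -> np.ndarray:
        """Return the cache contents as one (length, n_heads, 2*head_dim) array."""
        return np.concatenate(self.chunks, axis=0)[: self.length]

# Usage: memory grows in 64-token steps rather than max_seq_len up front.
cache = ChunkedKVCache(n_heads=8, head_dim=64)
for _ in range(100):
    cache.append(np.zeros((8, 128), dtype=np.float16))
print(cache.view().shape)  # (100, 8, 128)

The design point this illustrates is the trade-off between fragmentation and over-reservation: smaller chunks waste less memory per sequence but require more frequent allocations.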