Δ-Mem: Efficient Online Memory for Large Language Models
Abstract
Large language models achieve state-of-the-art performance across a wide range of natural language processing tasks, but their large memory footprint makes them difficult to deploy on resource-constrained devices. In this paper, we propose Δ-Mem, a memory-efficient approach that reduces this footprint through a novel online memory allocation scheme. Our experiments show that Δ-Mem achieves significant memory savings while maintaining model accuracy.
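
The abstract does not spell out how the online allocation scheme works. As a purely illustrative sketch, and not Δ-Mem itself, the following Python snippet shows one common flavor of online memory allocation in LLM inference: a key/value cache that grows one fixed-size chunk at a time as tokens arrive, rather than reserving space for the maximum sequence length up front. All names (ChunkedKVCache, chunk_tokens) are hypothetical.

# Purely illustrative: chunked, on-demand KV-cache allocation.
# This is NOT the Δ-Mem algorithm (the abstract does not specify it);
# it only shows the general idea of online allocation, where memory
# grows with the actual sequence length instead of being reserved
# up front for the maximum length.

import numpy as np

class ChunkedKVCache:
    """Grows the key/value cache one fixed-size chunk at a time."""

    def __init__(self, n_heads: int, head_dim: int, chunk_tokens: int = 64):
        self.n_heads = n_heads
        self.head_dim = head_dim
        self.chunk_tokens = chunk_tokens
        self.chunks: list[np.ndarray] = []  # each: (chunk_tokens, n_heads, 2*head_dim)
        self.length = 0  # tokens currently stored

    def append(self, kv: np.ndarray) -> None:
        """Store the key/value vectors for one new token."""
        assert kv.shape == (self.n_heads, 2 * self.head_dim)
        slot = self.length % self.chunk_tokens
        if slot == 0:  # current chunk is full (or cache is empty): allocate online
            self.chunks.append(
                np.empty((self.chunk_tokens, self.n_heads, 2 * self.head_dim),
                         dtype=np.float16)
            )
        self.chunks[-1][slot] = kv
        self.length += 1

    def view(self) -> np.ndarray:
        """Return the cache contents as one (length, n_heads, 2*head_dim) array."""
        return np.concatenate(self.chunks, axis=0)[: self.length]

# Usage: memory grows in 64-token steps rather than max_seq_len up front.
cache = ChunkedKVCache(n_heads=8, head_dim=64)
for _ in range(100):
    cache.append(np.zeros((8, 128), dtype=np.float16))
print(cache.view().shape)  # (100, 8, 128)

The design point this illustrates is the trade-off between fragmentation and over-reservation: smaller chunks waste less memory per sequence but require more frequent allocations.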