Blogs - Arun Verma

Blogs

MeMo: Memory as a Model (May 2026)

MeMo augments any LLM with up-to-date or domain-specific knowledge via a trained memory model, avoiding costly retraining, mitigating catastrophic forgetting, and remaining robust to retrieval noise.

MineDraft: A Framework for Batch Parallel Speculative Decoding (Mar 2026)

MineDraft accelerates large language model inference by overlapping the drafting and verification stages of speculative decoding, hiding latency and unlocking substantial throughput gains in batch settings.