Abstract: The deployment of Large Language Models (LLMs) is rapidly expanding across diverse applications, necessitating cost-effective and resource-efficient strategies to optimize their usage. This ...