

# Manage throughput quotas
<a name="genrel01"></a>


| GENREL01: How do you determine throughput quotas (or needs) for foundation models? | 
| --- | 
|   | 

Foundation models perform complex tasks over detailed input, and the number of inference requests they can serve at one time is limited. This is particularly true for managed and serverless model hosting. Understanding and managing these throughput quotas is crucial for maintaining reliable service levels and optimal performance.
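
As a rough illustration of how throughput needs can be compared with a quota, the following minimal sketch projects peak demand in requests per minute and tokens per minute and reports what fraction of each limit that demand would consume. The `Quota` class, the `estimate_utilization` function, and all numeric values are hypothetical and chosen only for the example. Actual limits depend on the provider, model, Region, and account, and should be taken from your own quota information.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    """Hypothetical per-model limits; real values vary by provider and account."""
    requests_per_minute: int
    tokens_per_minute: int


def estimate_utilization(peak_requests_per_second: float,
                         avg_tokens_per_request: float,
                         quota: Quota) -> dict:
    """Project peak demand and the fraction of each quota it would consume."""
    required_rpm = peak_requests_per_second * 60
    required_tpm = required_rpm * avg_tokens_per_request
    return {
        "required_rpm": required_rpm,
        "required_tpm": required_tpm,
        "rpm_utilization": required_rpm / quota.requests_per_minute,
        "tpm_utilization": required_tpm / quota.tokens_per_minute,
    }


if __name__ == "__main__":
    # Illustrative numbers only: 5 requests/second at peak,
    # about 1,500 tokens (input plus output) per request.
    quota = Quota(requests_per_minute=600, tokens_per_minute=300_000)
    result = estimate_utilization(5, 1_500, quota)
    for name, value in result.items():
        print(f"{name}: {value}")
```

With these assumed numbers, the request quota is only half used while the token quota is exceeded by 50 percent, which shows why both dimensions are worth checking: a utilization above 1.0 on either one suggests that requests would be throttled at peak.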

**Topics**
+ [GENREL01-BP01 Scale and balance foundation model throughput as a function of utilization](genrel01-bp01.md)