
Throttling and Quota Management

Statement

With Company’s API-First initiative, Service is moving quickly toward a self-service model. In a world of APIs, nothing limits how often clients can access your resources by default, so in the interest of both developers and customers we have decided to limit access to the POS APIs. That is how the idea of throttling and user quotas came into the picture, and we want to implement both in Service.

API Throttling

  • API throttling is a way to control how heavily different clients and developers use an API.
  • It is generally measured in requests per second, minute, hour, day, week, month, or year.
  • Throttling can be applied per request type (POST/PUT/GET), per API endpoint, and so on.
  • When the configured limit is exceeded, the user receives a “Too Many Requests” message with response status code 429.
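
As a rough illustration of the points above, the sketch below keeps one limit per request type and endpoint and maps an over-limit count to status 429. The class name `ThrottleRules`, its methods, and the `/orders` path in the usage note are hypothetical names chosen for this example, not part of any library.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical rule table: throttle limits keyed by HTTP method and endpoint path. */
class ThrottleRules {
    static final int OK = 200;
    static final int TOO_MANY_REQUESTS = 429;

    /** One limit: at most maxRequests per period. */
    static final class Rule {
        final int maxRequests;
        final Duration period;

        Rule(int maxRequests, Duration period) {
            this.maxRequests = maxRequests;
            this.period = period;
        }
    }

    private final Map<String, Rule> rules = new HashMap<>();

    void addRule(String method, String path, int maxRequests, Duration period) {
        rules.put(key(method, path), new Rule(maxRequests, period));
    }

    /** Returns the rule for this method and path, or null when none is configured. */
    Rule ruleFor(String method, String path) {
        return rules.get(key(method, path));
    }

    /**
     * Maps a request count to a status code. requestsInPeriod includes the
     * current request as well as earlier ones in the same period.
     */
    static int statusFor(int requestsInPeriod, Rule rule) {
        if (rule == null || requestsInPeriod <= rule.maxRequests) {
            return OK;
        }
        return TOO_MANY_REQUESTS;
    }

    private static String key(String method, String path) {
        return method.toUpperCase(Locale.ROOT) + " " + path;
    }
}
```

For example, after `addRule("POST", "/orders", 2, Duration.ofMinutes(1))`, the third POST to `/orders` within a minute would map to 429, while a GET to the same path has no rule and is left unthrottled.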

There are two main types of throttling:

Soft: In this type, when the number of API requests exceeds a configured percentage of the throttle limit (for example, 70% or 80%), the service sends an alert to the user.

Hard: In this type, the number of API requests cannot exceed the configured threshold; requests beyond it are rejected.
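
The difference between the two types can be sketched as a single decision function. This is a minimal illustration, assuming the soft threshold is given as a whole-number percentage of the hard limit; `ThrottleEvaluator` and `Decision` are hypothetical names.

```java
/** Hypothetical helper that classifies a request count against soft and hard limits. */
class ThrottleEvaluator {
    enum Decision { ALLOW, ALLOW_WITH_ALERT, REJECT }

    /**
     * @param requestCount requests seen in the current period, including this one
     * @param hardLimit    maximum requests allowed in the period
     * @param softPercent  percentage of hardLimit above which an alert is raised (e.g. 70)
     */
    static Decision evaluate(int requestCount, int hardLimit, int softPercent) {
        // Hard throttling: the count may never exceed the configured limit.
        if (requestCount > hardLimit) {
            return Decision.REJECT;
        }
        // Soft throttling: compare requestCount / hardLimit with softPercent / 100
        // using long arithmetic to avoid rounding and overflow.
        if ((long) requestCount * 100 > (long) hardLimit * softPercent) {
            return Decision.ALLOW_WITH_ALERT;
        }
        return Decision.ALLOW;
    }
}
```

With a hard limit of 100 and a soft threshold of 70%, request 70 is allowed silently, requests 71 through 100 are allowed with an alert, and request 101 is rejected.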

In a Dropwizard application, this can be implemented with the RateLimiter class or the @Throttling annotation. The mechanism is designed to have very low overhead: it counts the number of requests made with a token during the throttling period and compares that count with the allowed number of requests. If an access token is throttled, requests using it are denied until a full throttling period has passed, after which the token can access the API again with its throttling count reset to zero.
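
The counting mechanism described here can be sketched in plain Java as a fixed-window counter per access token. This is not the RateLimiter class or the @Throttling annotation themselves; `FixedWindowThrottle` is a hypothetical class, and it assumes a period begins with the first request seen after the previous period has ended. The clock is injected so the behaviour can be exercised without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical fixed-window request counter keyed by access token. */
class FixedWindowThrottle {
    private static final class Window {
        final long startMillis;
        int count;

        Window(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final int limit;
    private final long periodMillis;
    private final LongSupplier clock; // supplies the current time in milliseconds
    private final Map<String, Window> windows = new HashMap<>();

    FixedWindowThrottle(int limit, long periodMillis, LongSupplier clock) {
        this.limit = limit;
        this.periodMillis = periodMillis;
        this.clock = clock;
    }

    /** Returns true if the request is allowed, false if the token is throttled. */
    synchronized boolean tryAcquire(String token) {
        long now = clock.getAsLong();
        Window window = windows.get(token);
        // Start a fresh window with a zero count once a full period has passed.
        if (window == null || now - window.startMillis >= periodMillis) {
            window = new Window(now);
            windows.put(token, window);
        }
        if (window.count >= limit) {
            return false; // denied until the current period ends
        }
        window.count++;
        return true;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, for example `new FixedWindowThrottle(100, 60_000L, System::currentTimeMillis)` for 100 requests per minute per token. The per-request work is one map lookup and one comparison, which is in keeping with the low-overhead goal.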

User Quota Management 

  • A user quota is similar to API throttling, but it applies to a collection of client keys, such as x-api-key values.
  • The quota limit varies from client to client, depending on their load and requirements.
  • Once the quota is exhausted, an automatic or manual reset is required before any further requests with that API key are allowed.
  • Like throttling, it is measured in requests per second, minute, hour, day, week, month, or year.
  • Generally, when both a quota and throttling are configured for a client, the API gateway applies the throttling conditions first and then, depending on whether the request was successful, increments the quota count for the API key.
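
The ordering in the last point (throttle first, then count against the quota) can be sketched as follows. This is a simplified illustration rather than how any particular gateway works: `QuotaGate` is a hypothetical class, the throttle is passed in as a predicate, and "successful" is taken to mean only that the request passed throttling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical gate that applies throttling first, then a per-API-key quota. */
class QuotaGate {
    enum Result { OK, THROTTLED, QUOTA_EXCEEDED }

    private final Predicate<String> throttle; // returns true when the request passes throttling
    private final Map<String, Integer> quotaLimits = new HashMap<>();
    private final Map<String, Integer> used = new HashMap<>();

    QuotaGate(Predicate<String> throttle) {
        this.throttle = throttle;
    }

    /** Quota limits vary per client, so each API key gets its own limit. */
    synchronized void setQuota(String apiKey, int limit) {
        quotaLimits.put(apiKey, limit);
    }

    synchronized Result handle(String apiKey) {
        // Step 1: throttling. A throttled request does not consume quota.
        if (!throttle.test(apiKey)) {
            return Result.THROTTLED;
        }
        // Step 2: quota. Keys without a configured quota are treated as unlimited here.
        int limit = quotaLimits.getOrDefault(apiKey, Integer.MAX_VALUE);
        int count = used.getOrDefault(apiKey, 0);
        if (count >= limit) {
            return Result.QUOTA_EXCEEDED; // stays exceeded until reset
        }
        used.put(apiKey, count + 1);
        return Result.OK;
    }

    /** Manual or scheduled reset that allows requests with this key again. */
    synchronized void reset(String apiKey) {
        used.remove(apiKey);
    }

    synchronized int usedCount(String apiKey) {
        return used.getOrDefault(apiKey, 0);
    }
}
```

Because the throttle check runs first, a burst that is rejected by throttling leaves the quota count unchanged, and a key whose quota is exhausted keeps returning `QUOTA_EXCEEDED` until `reset` is called.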

Monitoring and Alerting

Once throttling and user quotas are enabled in the service, we can push these metrics through Observability and view them on Grafana dashboards. There we can set up alerts by mail or Slack that fire when a client exceeds a given throttling or quota limit.

Note: Companies have their own throttling policies. As soon as a request is made to the application, it is first routed through the company’s gateway, where certain throttling rules are applied. The application’s API throttling policies take effect only after the company’s throttling policies.
