Will My IP Get Banned? Detailed Explanation of Binance API Rate Limits and Weights

2026-04-14 · 22 min · FlyVault Editorial

A complete guide to Binance API weights and rate-limiting mechanisms: 6000-weight-per-minute allocation, endpoint weight lists, reading response headers, 429/418 triggers, token bucket implementation, and WebSocket alternatives, including Python rate-limiting code.

The core mechanism of Binance API rate limiting is the weight bucket: each IP can consume at most 6000 weight per minute on Spot, and each endpoint deducts between 1 and 100 weight depending on its complexity. Exceeding the limit returns a 429 Too Many Requests error, and repeated violations escalate to a 418 error, which carries an IP ban lasting from 2 minutes to 3 days. This article covers weight calculation, response header monitoring, client-side rate limiting, and WebSocket alternatives, so that your strategy runs stably without triggering risk controls. Readers who do not yet have an API key should complete KYC on the official Binance website; those without an account need to register first.

1. The Three Dimensions of Rate Limiting

Binance imposes three independent limits on the same IP / API Key:

Dimension	Spot Limit	Futures Limit	Violation Response
Request Weight (REQUEST_WEIGHT)	6000 / minute	2400 / minute	429
Order Count (ORDERS)	100 / 10s; 200,000 / day	300 / 10s; 1200 / minute	429
Raw Request Count (RAW_REQUESTS)	61,000 / 5 minutes	61,000 / 5 minutes	429
IP Ban	Repeated violations	Repeated violations	418

Key Insight: Although the weight of an ordering endpoint is only 1, it simultaneously consumes the ORDERS bucket. Hitting the limit of either bucket will trigger a rate limit.
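The two-bucket behavior can be sketched with a small client-side sliding-window counter for the ORDERS limit, kept alongside whatever tracks request weight. This is only an illustration, not Binance's server-side algorithm: the class name `OrderWindow` and the injectable `clock` parameter are assumptions made for the example, and the default of 100 orders per 10 seconds comes from the Spot column of the table above.

```python
import time
from collections import deque


class OrderWindow:
    """Client-side sliding-window counter for the ORDERS bucket (hypothetical helper)."""

    def __init__(self, limit=100, window_sec=10.0, clock=time.monotonic):
        self.limit = limit            # max orders allowed inside the window
        self.window_sec = window_sec  # window length in seconds
        self.clock = clock            # injectable so the logic can be tested
        self.sent = deque()           # timestamps of recently placed orders

    def _evict(self, now):
        # Drop timestamps that have fallen out of the window.
        while self.sent and now - self.sent[0] >= self.window_sec:
            self.sent.popleft()

    def try_place(self):
        """Record an order and return True, or return False if the window is full."""
        now = self.clock()
        self._evict(now)
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True
```

A caller would check `try_place()` before sending POST /api/v3/order and still deduct the order's request weight of 1 from the weight budget separately.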

2. Querying Current Weight Quotas

You can retrieve the configured limits via the rateLimits field of GET /api/v3/exchangeInfo (this shows the quota itself; current usage is reported in the response headers covered in section 4):

curl -s "https://api.binance.com/api/v3/exchangeInfo" | \
  jq '.rateLimits'

Returns:

[
  {"rateLimitType": "REQUEST_WEIGHT", "interval": "MINUTE", "intervalNum": 1, "limit": 6000},
  {"rateLimitType": "ORDERS", "interval": "SECOND", "intervalNum": 10, "limit": 100},
  {"rateLimitType": "ORDERS", "interval": "DAY", "intervalNum": 1, "limit": 200000},
  {"rateLimitType": "RAW_REQUESTS", "interval": "MINUTE", "intervalNum": 5, "limit": 61000}
]
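To make this response easier to use in a limiter, each entry can be reduced to a window length in seconds. The sketch below assumes the three interval units that appear in the sample (SECOND, MINUTE, DAY); the function name `summarize_rate_limits` is made up for the example.

```python
# Seconds per interval unit, matching the "interval" values in the sample above.
INTERVAL_SECONDS = {"SECOND": 1, "MINUTE": 60, "DAY": 86400}


def summarize_rate_limits(rate_limits):
    """Map (rateLimitType, window_seconds) -> limit."""
    summary = {}
    for entry in rate_limits:
        window = INTERVAL_SECONDS[entry["interval"]] * entry["intervalNum"]
        summary[(entry["rateLimitType"], window)] = entry["limit"]
    return summary


# The sample response shown above, pasted in as Python data.
sample = [
    {"rateLimitType": "REQUEST_WEIGHT", "interval": "MINUTE", "intervalNum": 1, "limit": 6000},
    {"rateLimitType": "ORDERS", "interval": "SECOND", "intervalNum": 10, "limit": 100},
    {"rateLimitType": "ORDERS", "interval": "DAY", "intervalNum": 1, "limit": 200000},
    {"rateLimitType": "RAW_REQUESTS", "interval": "MINUTE", "intervalNum": 5, "limit": 61000},
]

limits = summarize_rate_limits(sample)
print(limits[("REQUEST_WEIGHT", 60)])  # weight allowed per 60-second window
```

Keying on the window length keeps the two ORDERS entries (10 seconds and 1 day) distinct.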

Accounts with higher VIP levels can apply for higher weight limits, but the standard 6000 weight is sufficient for most strategies.

3. Common Endpoint Weight Comparison Table

Endpoint	Weight	Description
GET /api/v3/ping	1	Connectivity test
GET /api/v3/time	1	Server time
GET /api/v3/exchangeInfo	20	Trading rules (cache for 1 hour)
GET /api/v3/ticker/price (single symbol)	1	Single symbol price
GET /api/v3/ticker/price (all)	4	Fetch all prices at once
GET /api/v3/ticker/24hr (single symbol)	1	Single symbol 24h stats
GET /api/v3/ticker/24hr (all)	80	Stats for all pairs
GET /api/v3/depth limit=5/10/20/50/100	1	Order book depth
GET /api/v3/depth limit=500	5	Order book depth
GET /api/v3/depth limit=1000	10	Order book depth
GET /api/v3/depth limit=5000	50	Order book depth (use with caution)
GET /api/v3/klines	2	K-lines (candlesticks)
GET /api/v3/historicalTrades	5	Historical trade data
GET /api/v3/account	20	Account balances
GET /api/v3/openOrders (single symbol)	6	Current open orders
GET /api/v3/openOrders (all)	80	All open orders
GET /api/v3/allOrders	20	Historical orders
POST /api/v3/order	1	Create order
DELETE /api/v3/order	1	Cancel order
DELETE /api/v3/openOrders	1	Cancel all open orders

Performance Trap: Avoid looping calls to ticker/24hr for individual symbols. Use the parameterless endpoint to fetch all at once, reducing weight from 300 symbols × 1 = 300 down to 80.
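The arithmetic behind this trap also has a break-even point: with the weights listed in the table above (1 per single-symbol call, 80 for the full list), looping is still cheaper below 80 symbols. A minimal sketch follows; the function name `cheaper_plan` is hypothetical and the default weights are taken from the table.

```python
def cheaper_plan(symbol_count, single_weight=1, batch_weight=80):
    """Return ("loop" or "batch", weight) for whichever option costs less."""
    looped = symbol_count * single_weight
    if batch_weight < looped:
        return ("batch", batch_weight)
    return ("loop", looped)


print(cheaper_plan(300))  # the 300-symbol case described above
print(cheaper_plan(10))   # a small watchlist
```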

4. Reading Response Headers for Adaptive Rate Limiting

Every REST request response includes the currently used weight. Clients should read this to adjust dynamically:

import requests, time

BASE_URL = "https://api.binance.com"

class RateLimiter:
    def __init__(self, max_weight=6000, safety_ratio=0.8):
        self.max_weight = max_weight
        self.safety = safety_ratio  # Use only 80% to prevent cold-start errors
        self.used_weight = 0

    def update_from_headers(self, headers: dict):
        used = headers.get("X-MBX-USED-WEIGHT-1m")
        if used:
            self.used_weight = int(used)

    def should_wait(self) -> float:
        """Returns recommended wait seconds; 0 means ready to call."""
        threshold = self.max_weight * self.safety
        if self.used_weight >= threshold:
            # Estimate seconds until next minute reset
            return 60 - (int(time.time()) % 60)
        return 0

limiter = RateLimiter()

def safe_get(path, params=None):
    wait = limiter.should_wait()
    if wait > 0:
        print(f"[Rate Limit] {limiter.used_weight} weight used, pausing for {wait}s")
        time.sleep(wait)
    r = requests.get(f"{BASE_URL}{path}", params=params, timeout=10)
    limiter.update_from_headers(r.headers)
    return r.json()

# Usage
data = safe_get("/api/v3/ticker/24hr")
print(f"Current weight usage: {limiter.used_weight}/6000")

5. Token Bucket Rate Limiting (Proactive Client Control)

A more reliable approach than passively watching headers is a Client-side Token Bucket:

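import threading
import time

import requests  # BASE_URL is defined in the section 4 example

class TokenBucket:
    def __init__(self, capacity=6000, refill_per_sec=100):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_sec  # 6000/60 = 100 per second
        self.last = time.monotonic()  # monotonic clock ignores wall-clock jumps
        self.lock = threading.Lock()

    def acquire(self, cost=1):
        # The lock is held while sleeping, so waiting callers are served one at a time.
        with self.lock:
            now = time.monotonic()
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill)
            self.last = now
            if self.tokens < cost:
                wait = (cost - self.tokens) / self.refill
                time.sleep(wait)
                # Tokens refilled during the sleep are spent on this call, so
                # advance the clock to avoid crediting that time a second time.
                self.last = time.monotonic()
                self.tokens = 0
            else:
                self.tokens -= cost

bucket = TokenBucket(capacity=6000, refill_per_sec=100)

def call(path, weight):
    bucket.acquire(weight)
    return requests.get(f"{BASE_URL}{path}", timeout=10).json()

# Pass each endpoint's weight from the table in section 3.
# No symbol parameter is sent, so this is the all-prices variant (weight 4).
call("/api/v3/ticker/price", 4)
# Account query (weight 20); a signed endpoint that needs an API key and signature in practice.
call("/api/v3/account", 20)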

6. Correct Handling of 429 and 418 Errors

1. Receiving a 429

def request_with_retry(method, url, **kwargs):
    for attempt in range(3):
        r = requests.request(method, url, **kwargs)
        if r.status_code == 429:
            retry_after = int(r.headers.get("Retry-After", 60))
            print(f"Rate limit triggered, sleeping for {retry_after}s")
            time.sleep(retry_after)
            continue
        if r.status_code == 418:
            print("IP banned, application must stop!")
            raise SystemExit(1)
        return r
    raise Exception("Max retries exceeded")

The Retry-After header provides the exact number of seconds to wait; simply sleep for that duration.

2. Receiving a 418

A 418 is a severe warning: it means requests kept arriving after a 429, and the IP is now banned. Bans start at 2 minutes and can escalate to 3 days. Once you receive one, stop all requests immediately and wait at least the duration given by Retry-After before resuming.
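One way to enforce the "stop everything" rule across a whole application is a small guard object that remembers when requests may resume. The sketch below is an assumption-laden illustration: `BanGuard` and its methods are hypothetical names, the 60-second fallback mirrors the default used in the retry function above, and Retry-After is assumed to be a number of seconds.

```python
import time


class BanGuard:
    """Blocks outgoing calls until a reported 429/418 wait has expired (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock          # injectable so the logic can be tested
        self.blocked_until = 0.0    # clock value before which no request may be sent

    def record_response(self, status_code, headers):
        # On 429 or 418, remember the earliest time requests may resume.
        if status_code in (418, 429):
            retry_after = float(headers.get("Retry-After", 60))
            self.blocked_until = max(self.blocked_until, self.clock() + retry_after)

    def seconds_blocked(self):
        """Remaining wait in seconds; 0.0 means requests are allowed."""
        return max(0.0, self.blocked_until - self.clock())

    def allow(self):
        return self.seconds_blocked() == 0.0
```

Every request path would call `allow()` before sending and `record_response()` after receiving, so that one 418 silences all threads rather than only the one that saw it.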

7. WebSocket Alternative: Near-Zero Weight Consumption

REST polling consumes weight with every request. WebSocket subscriptions only consume weight once during the initial connection; real-time pushes do not count toward weight:

import json, websocket

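def on_message(ws, message):
    # Combined streams (/stream?streams=...) wrap each event as
    # {"stream": "<name>", "data": {...}}, so unwrap the payload first.
    data = json.loads(message)["data"]
    print(f"{data['s']} Price {data['c']}, 24h Vol {data['v']}")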

ws = websocket.WebSocketApp(
    "wss://stream.binance.com:9443/stream?streams=btcusdt@ticker/ethusdt@ticker",
    on_message=on_message
)
ws.run_forever()

Cost Comparison: Polling real-time tickers for 10 pairs via REST every second = 600 calls/minute × 1 weight = 600 weight per minute. The same data over a WebSocket subscription costs no weight per update.
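The same calculation can be generalized to see how much of the per-minute budget a polling loop would occupy. This is a rough sketch: the function names are invented for the example, and the default weight of 1 per call and the limit of 6000 are the figures used in this article.

```python
def polling_weight_per_minute(symbols, interval_sec, weight_per_call=1):
    """Weight consumed per minute when polling each symbol every interval_sec seconds."""
    calls_per_minute = symbols * (60 / interval_sec)
    return calls_per_minute * weight_per_call


def budget_share(weight_per_minute, limit=6000):
    """Fraction of the per-minute weight limit that the polling loop uses."""
    return weight_per_minute / limit


used = polling_weight_per_minute(symbols=10, interval_sec=1)
print(used, budget_share(used))
```

For the 10-pair example this comes to a tenth of the Spot budget spent on data that a stream delivers without polling.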

8. Practical Weight Optimization Tips

Cache exchangeInfo: Updating once per hour is sufficient; higher frequencies just waste 20 weight per call.
Prioritize Batch Queries: Fetching all tickers at once via /ticker/24hr without parameters costs 80 weight, versus 300 for looping over 300 symbols individually (about 73% less).
Only Fetch Necessary Depth: Limit=20 is enough for most limit orders (weight 1); don't pull 5000 levels.
Use userDataStream for Order Status: It is more efficient and consumes zero weight compared to periodic GET /order calls.
Use Batch Cancel for All Orders: DELETE /openOrders?symbol=BTCUSDT costs 1 weight, saving more than individual cancels.
Avoid Peak Windows: Strategies tend to consume more weight around settlement times (UTC 00:00, 08:00, 16:00); schedule non-urgent queries outside these peaks.
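The first tip, caching exchangeInfo for an hour, can be sketched as a small time-to-live cache. `TTLCache` is a hypothetical helper written for this example; the `fetch` callable is whatever performs the real request, for instance a lambda that calls the `safe_get` function from section 4.

```python
import time


class TTLCache:
    """Caches one fetched value for ttl_sec seconds (hypothetical helper)."""

    def __init__(self, fetch, ttl_sec=3600.0, clock=time.monotonic):
        self.fetch = fetch          # zero-argument callable that does the real request
        self.ttl_sec = ttl_sec      # one hour by default, as suggested above
        self.clock = clock          # injectable so the logic can be tested
        self._value = None
        self._expires_at = None     # None means nothing has been cached yet

    def get(self):
        now = self.clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache miss or expired entry: spend the weight once and remember the result.
            self._value = self.fetch()
            self._expires_at = now + self.ttl_sec
        return self._value
```

With this in place, repeated lookups of trading rules within the hour cost no weight at all, since only the first call reaches the API.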

9. FAQ

Q1: Is the weight calculated by IP or API Key?

A: Primarily by IP. Multiple keys under the same IP share the weight bucket. Switching IPs can bypass IP-level weight limits, but order count limits are still tied to the account. Using proxies to rotate IPs is identifiable by Binance and not recommended.

Q2: What is the difference between X-MBX-USED-WEIGHT and X-MBX-USED-WEIGHT-1m in response headers?

A: X-MBX-USED-WEIGHT-1m is the officially recommended header, explicitly denoting the 1-minute window. X-MBX-USED-WEIGHT is a legacy field with the same value. Use the one with the -1m suffix.
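A client that wants to be robust to other window suffixes can collect every suffixed used-weight header instead of hard-coding one name. The sketch below assumes the suffix is an interval number followed by a unit letter (as in `1m`); the function name is hypothetical, and the unsuffixed legacy header is deliberately ignored.

```python
import re

# Matches names such as "X-MBX-USED-WEIGHT-1m"; header names are case-insensitive.
_USED_WEIGHT = re.compile(r"^x-mbx-used-weight-(\d+)([smhd])$", re.IGNORECASE)


def used_weight_by_window(headers):
    """Return {"1m": 123, ...} for every suffixed used-weight header present."""
    result = {}
    for name, value in headers.items():
        match = _USED_WEIGHT.match(name)
        if match:
            window = f"{match.group(1)}{match.group(2).lower()}"
            result[window] = int(value)
    return result


example = {"X-MBX-USED-WEIGHT": "42", "X-MBX-USED-WEIGHT-1M": "42", "Content-Type": "application/json"}
print(used_weight_by_window(example))
```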

Q3: How long will I be banned after a 418?

A: The first violation is usually 2 minutes, escalating to 5/15/60 minutes, and up to 3 days in extreme cases. The Retry-After header gives the precise time. Optimize your code immediately upon recovery to prevent escalating bans.

Q4: How do I calculate weight for concurrent calls in a multi-threaded app?

A: Use a process-wide global token bucket (see section 5) or a distributed Redis-based token bucket for multi-machine deployments. A threading.Lock is sufficient for a single machine.

Q5: Are the weight rules the same on the testnet (testnet.binance.vision)?

A: Testnet weight limits are generally more relaxed (often 10x the mainnet), but the rule structure is identical. Do not estimate mainnet performance based on testnet consumption; always validate on the mainnet with low traffic before going live.

