NanoGPT API Key Setup: Beginner Guide + Troubleshooting

NanoGPT gives you access to 400+ AI models through a single OpenAI-compatible API. No subscription required — pay per prompt with crypto or card.

This guide covers everything: account creation, API key generation, your first API call, and fixing the errors you'll hit along the way.

What Is the NanoGPT API?

The NanoGPT API is a unified endpoint that routes your requests to different AI models. You send one request format, and NanoGPT handles the backend routing to GPT-4o, Claude, Llama, DeepSeek, and hundreds of others.

OpenAI-Compatible Endpoint

The API follows the OpenAI /v1/chat/completions format. This means:

  • Code written against OpenAI's chat completions API should work with NanoGPT
  • You only change the base URL and the API key
  • Libraries like openai-python, openai-node, and langchain work once they point at the NanoGPT base URL
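To make the "change the base URL and API key" point concrete, here is a minimal sketch that builds the same chat completions request for both providers using only the Python standard library. The helper name build_chat_request and the placeholder keys are made up for illustration, and nothing is sent over the network.

```python
import json
import urllib.request

OPENAI_BASE_URL = "https://api.openai.com/v1"
NANOGPT_BASE_URL = "https://api.nanogpt.com/v1"  # base URL used throughout this guide


def build_chat_request(base_url, api_key, model, user_message):
    """Build (but do not send) an OpenAI-style chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder keys: only the host and the key differ between the two requests.
openai_req = build_chat_request(OPENAI_BASE_URL, "OPENAI_KEY", "gpt-4o", "Hello")
nanogpt_req = build_chat_request(NANOGPT_BASE_URL, "NANOGPT_KEY", "gpt-4o", "Hello")

print(openai_req.full_url)
print(nanogpt_req.full_url)
print(openai_req.data == nanogpt_req.data)  # the JSON body is identical
```

The request body is byte-for-byte the same in both cases, which is why existing OpenAI clients usually need only a configuration change.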

400+ Models Through One API

Instead of managing separate accounts for OpenAI, Anthropic, Google, and Meta, you get everything through NanoGPT. Switch models by changing one parameter.

Pay-Per-Prompt Pricing

No monthly subscription. You deposit funds (crypto or card) and pay per API call. Costs vary by model:

  • GPT-4o: ~$0.005 to $0.01 per request
  • Claude 3.5 Sonnet: ~$0.003 to $0.008 per request
  • Llama 3 70B: ~$0.001 to $0.003 per request
  • DeepSeek V3: ~$0.0005 to $0.001 per request

Or get the $8/month flat rate for unlimited access to select models.
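If you're deciding between pay-per-prompt and the flat rate, a quick calculation helps. The sketch below works out how many requests $8 buys at the approximate per-request costs listed above. It is only arithmetic on those rough figures: real costs depend on prompt length, and the flat plan covers select models only, so the comparison may not apply to every model shown.

```python
from decimal import Decimal

FLAT_PLAN_USD = Decimal("8")  # monthly flat rate quoted above

# Approximate per-request cost ranges (low, high) in USD, copied from the list above.
COST_PER_REQUEST = {
    "GPT-4o": (Decimal("0.005"), Decimal("0.01")),
    "Claude 3.5 Sonnet": (Decimal("0.003"), Decimal("0.008")),
    "Llama 3 70B": (Decimal("0.001"), Decimal("0.003")),
    "DeepSeek V3": (Decimal("0.0005"), Decimal("0.001")),
}


def requests_for_budget(budget, cost_per_request):
    """Whole number of requests a budget covers at a fixed per-request cost."""
    return int(budget // cost_per_request)


for model, (low, high) in COST_PER_REQUEST.items():
    fewest = requests_for_budget(FLAT_PLAN_USD, high)  # most expensive case
    most = requests_for_budget(FLAT_PLAN_USD, low)     # cheapest case
    print(f"{model}: $8 covers roughly {fewest} to {most} requests")
```

For GPT-4o this works out to roughly 800 to 1,600 requests. If you send fewer than that per month, pay-per-prompt is likely the cheaper option.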

Step 1: Create Your NanoGPT Account

  1. Go to nanogpt.com
  2. Click "Sign Up"
  3. Enter your email (or use crypto wallet login)
  4. Verify your email

No KYC. No ID verification. Just an email.

Deposit Options

Before using the API, you need credits. NanoGPT accepts:

Payment Method    | Min Deposit | Processing Time
------------------|-------------|-----------------
Monero (XMR)      | $1          | ~2 minutes
Bitcoin (BTC)     | $1          | 10 to 30 minutes
Bitcoin Lightning | $1          | Instant
Nano (XNO)        | $1          | Instant
Credit Card       | $10         | Instant

For privacy, use Monero or Nano. For speed, use Lightning or Nano.

Step 2: Generate Your API Key

  1. Log into your NanoGPT dashboard
  2. Navigate to Settings > API Keys
  3. Click "Generate New Key"
  4. Configure permissions (see below)
  5. Copy the key immediately — it's shown once

Your key looks like: ng-a1b2c3d4e5f6g7h8i9j0...

Key Permissions

When creating a key, you can set granular permissions:

Chat Completions — Required. This is the main API endpoint for text generation.

Image Generation — Optional. Enables access to image models (DALL-E, Stable Diffusion, etc.).

Model Listing — Recommended. Lets your code discover available models programmatically.

Embeddings — Optional. Only needed if you're building RAG or semantic search systems.

Rate Limit — You can set a per-key rate limit to prevent runaway costs. Recommended for production use.

Security Best Practices

  • Don't hardcode keys in source code. Use environment variables.
  • Use separate keys for development and production.
  • Set rate limits on keys used in public-facing applications.
  • Rotate keys if you suspect they've been compromised.
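The first practice is the easiest to get wrong, so here is a minimal sketch of loading the key from an environment variable instead of pasting it into code. The variable name NANOGPT_API_KEY and the helper load_api_key are assumptions made for this example, not names NanoGPT defines.

```python
import os


def load_api_key(env_var="NANOGPT_API_KEY"):
    """Return the API key stored in an environment variable.

    NANOGPT_API_KEY is a hypothetical name chosen for this sketch; use
    whatever name fits your deployment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

Call load_api_key() where the examples in this guide use "YOUR_API_KEY". Using a different variable name per environment (for example one for development and one for production) keeps the keys separate without touching the code.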

Step 3: Make Your First API Call

cURL Example

curl https://api.nanogpt.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello, what models are available?"}
    ],
    "max_tokens": 100
  }'

Replace YOUR_API_KEY with your actual key. The response follows the standard OpenAI format (abridged here):

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I can help you with various tasks..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 25,
    "total_tokens": 37
  }
}
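If you're calling the endpoint without an SDK, you'll need to pull the reply and token counts out of that JSON yourself. A minimal sketch, using the sample response above as input (the function name summarize_response is made up for illustration):

```python
import json

# The sample response shown above, as a raw JSON string.
raw_response = """
{
  "choices": [
    {"message": {"role": "assistant", "content": "I can help you with various tasks..."}}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 25, "total_tokens": 37}
}
"""


def summarize_response(body):
    """Extract the reply text and token counts from an OpenAI-style response."""
    data = json.loads(body)
    reply = data["choices"][0]["message"]["content"]
    usage = data.get("usage", {})  # tolerate responses without a usage block
    return reply, usage.get("prompt_tokens"), usage.get("completion_tokens")


reply, prompt_tokens, completion_tokens = summarize_response(raw_response)
print(reply)
print(f"tokens: {prompt_tokens} in, {completion_tokens} out")
```

The usage block is worth logging, since token counts are usually what drives the cost of a request.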

Python Example

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.nanogpt.com/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain quantum computing in one paragraph."}
    ],
    max_tokens=200
)

print(response.choices[0].message.content)

Install the library: pip install openai

JavaScript Example

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.nanogpt.com/v1'
});

async function main() {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' }
    ],
    max_tokens: 150
  });

  console.log(response.choices[0].message.content);
}

main();

Install the library: npm install openai

API Key Permissions Explained

Chat Completions

The core endpoint. Sends a conversation to the model and gets a response. Used by:

  • Chatbots
  • SillyTavern
  • LangChain applications
  • Any OpenAI-compatible client

Image Generation

Access to image models through the /v1/images/generations endpoint. Models include DALL-E 3, Stable Diffusion XL, and others.

Model Listing

The /v1/models endpoint returns all available models. Useful for building dynamic model selectors in your application.
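A rough sketch of a model lister using only the standard library follows. It assumes the endpoint returns the usual OpenAI list shape, {"data": [{"id": ...}, ...]}, which this guide implies but does not show. The function names and the sample JSON are illustrative, and the sample IDs are not a real listing.

```python
import json
import urllib.request

BASE_URL = "https://api.nanogpt.com/v1"


def build_models_request(api_key):
    """Build the GET request for the model listing endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(body):
    """Return sorted model IDs, assuming an OpenAI-style {"data": [...]} body."""
    data = json.loads(body)
    return sorted(item["id"] for item in data.get("data", []))


def list_models(api_key):
    """Perform the network call; needs a valid key with Model Listing enabled."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as resp:
        return extract_model_ids(resp.read().decode("utf-8"))


# Offline demonstration with a made-up response body.
sample = '{"object": "list", "data": [{"id": "gpt-4o"}, {"id": "claude-3.5-sonnet"}]}'
print(extract_model_ids(sample))
```

Keeping the parsing separate from the network call makes it easy to test a model selector without spending credits.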

Troubleshooting API Errors

401 Unauthorized — Key Invalid or Expired

What it means: Your API key is wrong, revoked, or missing.

Fixes:

  1. Check for typos or extra spaces in the key
  2. Verify the key is active in your NanoGPT dashboard
  3. Make sure you're using the key as a Bearer token: Authorization: Bearer YOUR_API_KEY
  4. If the key was revoked, generate a new one
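Fixes 1 and 3 can be checked in code before a request ever leaves your machine. This is a small sketch; the helper build_auth_header is hypothetical, and it only catches formatting mistakes, not revoked keys.

```python
def build_auth_header(raw_key):
    """Return a clean Authorization header, rejecting obviously malformed keys."""
    key = raw_key.strip()  # drop stray spaces and newlines from copy-paste
    if key.lower().startswith("bearer "):
        key = key[len("bearer "):].strip()  # avoid sending "Bearer Bearer ..."
    if not key:
        raise ValueError("API key is empty")
    if any(ch.isspace() for ch in key):
        raise ValueError("API key contains whitespace; re-copy it from the dashboard")
    return {"Authorization": f"Bearer {key}"}


print(build_auth_header("  ng-example-key\n"))
```

A trailing newline is a common culprit when the key is read from a file or a shell variable.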

429 Rate Limited — Too Many Requests

What it means: You've hit NanoGPT's rate limit.

Fixes:

  1. Add a delay between requests (start with 1 second)
  2. Implement exponential backoff in your code
  3. Upgrade to the $8/month plan for higher limits
  4. Set a per-key rate limit in NanoGPT dashboard to stay within bounds

Example backoff in Python:

import time
from openai import OpenAI, RateLimitError

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.nanogpt.com/v1")

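def call_with_retry(messages, max_retries=3):
    """Call the chat endpoint, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                break  # out of retries, so don't sleep again
            wait = 2 ** attempt  # 1s, 2s, 4s, ...
            print(f"Rate limited. Waiting {wait}s...")
            time.sleep(wait)
    raise RuntimeError("Max retries exceeded")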
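
If several workers get rate limited at the same moment, a fixed schedule makes them all retry together. A common variation is to add random jitter to each delay. The sketch below shows only the delay calculation ("full jitter" backoff) with a cap. The function name and default values are illustrative choices, not NanoGPT recommendations.

```python
import random


def backoff_delays(max_retries=3, base=1.0, cap=30.0, rng=random):
    """Yield one randomized delay per retry ("full jitter" backoff).

    Each delay is drawn uniformly from 0 up to min(cap, base * 2**attempt).
    """
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng.uniform(0, ceiling)


for attempt, delay in enumerate(backoff_delays(), start=1):
    print(f"retry {attempt}: would sleep {delay:.2f}s")
```

To use it, replace the fixed wait in a retry loop with the next value from backoff_delays().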

402 Payment Required — Insufficient Balance

What it means: Your NanoGPT balance is too low to cover the request.

Fixes:

  1. Deposit more credits at nanogpt.com
  2. Check your balance in the dashboard
  3. If using crypto, wait for confirmation (BTC: 10 to 30 min, XMR: ~2 min, XNO: instant)
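
If you call the API with the standard library rather than an SDK, the three status codes covered so far surface as urllib.error.HTTPError. Here is a minimal sketch that maps each one to the fix described above. The ADVICE table and the explain_http_error helper are made up for this example, and the error object is simulated so the snippet runs offline.

```python
import urllib.error

# Status codes covered in this troubleshooting section, with a one-line fix each.
ADVICE = {
    401: "Check the API key and the 'Authorization: Bearer ...' header.",
    402: "Balance too low: deposit credits or wait for the deposit to confirm.",
    429: "Rate limited: slow down and retry with backoff.",
}


def explain_http_error(error):
    """Translate an HTTPError into a short, actionable message."""
    return ADVICE.get(error.code, f"Unexpected HTTP {error.code}: {error.reason}")


# Simulated error for demonstration; a real one would come from urlopen().
simulated = urllib.error.HTTPError(
    url="https://api.nanogpt.com/v1/chat/completions",
    code=402,
    msg="Payment Required",
    hdrs=None,
    fp=None,
)
print(explain_http_error(simulated))
```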

Model Not Available

What it means: The model name is wrong or the model is temporarily unavailable.

Fixes:

  1. List available models: GET /v1/models
  2. Check spelling (model names are case-sensitive)
  3. Some models have region restrictions — try a VPN
  4. Popular models can be temporarily unavailable during peak hours
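
Because model names are case-sensitive, a small check against the model list can catch most typos before you send a request. The sketch below uses a hypothetical list of IDs; in practice you would fill it from GET /v1/models. The helper name check_model_name is also made up.

```python
import difflib


def check_model_name(requested, available):
    """Return (exact_match_or_None, suggestions) for a requested model ID."""
    if requested in available:
        return requested, []
    # Same name with different casing is the most likely mistake.
    by_lower = {name.lower(): name for name in available}
    if requested.lower() in by_lower:
        return None, [by_lower[requested.lower()]]
    # Otherwise fall back to fuzzy matching for probable typos.
    return None, difflib.get_close_matches(requested, available, n=3, cutoff=0.6)


available = ["gpt-4o", "claude-3.5-sonnet", "deepseek-v3"]  # hypothetical IDs

print(check_model_name("gpt-4o", available))
print(check_model_name("GPT-4o", available))
print(check_model_name("claude-3.5-sonet", available))
```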

NanoGPT API vs OpenAI API

Why use NanoGPT instead of going directly to OpenAI?

Feature         | NanoGPT API                      | OpenAI API
----------------|----------------------------------|-----------------------------
Models          | 400+ (GPT, Claude, Llama, etc.)  | GPT + DALL-E only
Pricing         | Pay-per-prompt or $8/month flat  | Pay-per-token (expensive)
Crypto payments | Yes (XMR, BTC, XNO)              | No
KYC             | None                             | Required
Privacy         | No logs claimed                  | Logs everything for 30 days
Rate limits     | Generous                         | Strict
Compatibility   | OpenAI-compatible                | Native

The main advantage is breadth. Alongside OpenAI's models you get Claude, Llama, DeepSeek, and 390+ others through one account, with the privacy and pricing differences listed in the table.

Real-World API Usage Examples

Here are practical examples beyond basic chat:

Summarization

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Summarize the following text in 3 bullet points."},
        {"role": "user", "content": "Your long text here..."}
    ]
)

Code Generation

response = client.chat.completions.create(
    model="claude-3.5-sonnet",
    messages=[
        {"role": "user", "content": "Write a Python function that validates email addresses using regex."}
    ],
    max_tokens=500
)

Streaming Responses

For real-time output (chatbots, UIs), use streaming:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

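for chunk in stream:
    # Some chunks carry no choices or no text (e.g. role-only or final chunks)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)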

Streaming reduces perceived latency. The first token typically arrives in 0.5 to 2 seconds, even if the full response takes 10+ seconds.
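
Under the hood, OpenAI-style streaming is delivered as server-sent events: each line starts with "data:" and carries a JSON chunk, and the stream ends with "data: [DONE]". The sketch below reassembles the text from such lines. It assumes NanoGPT uses the same framing, which follows from the compatibility claim but is not confirmed here, and the sample lines are invented.

```python
import json


def collect_stream_text(lines):
    """Join the content deltas from OpenAI-style SSE lines into one string."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


# Invented sample stream for demonstration.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Once upon"}}]}',
    'data: {"choices": [{"delta": {"content": " a time"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample_lines))
```

The SDK does this parsing for you. The sketch is mainly useful when you consume the stream from a language or runtime without an OpenAI client.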

FAQ

Is NanoGPT API free?

No. You need to deposit credits. But there's no subscription fee for basic access. You only pay for what you use. The $8/month plan offers unlimited access to select models if you're a heavy user.

How many API keys can I create?

Unlimited. Create as many as you need — one for development, one for production, one for each project. Each key can have different permissions and rate limits.

Does NanoGPT log my API requests?

NanoGPT claims they don't store request/response data. This is better than OpenAI (30-day retention) and most other providers. If logging is a concern, check their current privacy policy or use a local model through privacy-focused AI tools.

Can I use NanoGPT API with LangChain?

Yes. Set the OPENAI_API_BASE environment variable to https://api.nanogpt.com/v1 and use your NanoGPT key as OPENAI_API_KEY. LangChain will route all requests through NanoGPT.
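
You can also set both variables from Python, as long as it happens before any LangChain client is created. A minimal sketch follows. The helper name and the NANOGPT_API_KEY variable are assumptions for this example, and which variable names a given LangChain release reads may differ, so check its documentation.

```python
import os


def configure_openai_env(api_key, base_url="https://api.nanogpt.com/v1"):
    """Point OpenAI-compatible clients that read these variables at NanoGPT."""
    os.environ["OPENAI_API_BASE"] = base_url
    os.environ["OPENAI_API_KEY"] = api_key


# NANOGPT_API_KEY is a hypothetical variable holding your real key;
# the placeholder is used only if it is unset.
configure_openai_env(os.environ.get("NANOGPT_API_KEY", "YOUR_API_KEY"))
print(os.environ["OPENAI_API_BASE"])
```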

What's the maximum context length?

Depends on the model. GPT-4o: 128K tokens. Claude 3.5: 200K tokens. Llama 3: 8K to 128K tokens depending on version. Check the model listing endpoint for specifics.
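
For a quick sanity check before sending a long prompt, you can estimate whether the prompt plus the requested output fits the window. The sketch below uses the limits quoted above and a crude rule of thumb of about 4 characters per token. That ratio is a heuristic, not a tokenizer, and the dictionary keys are illustrative model IDs.

```python
# Context limits quoted above, in tokens; the keys are illustrative model IDs.
CONTEXT_LIMITS = {"gpt-4o": 128_000, "claude-3.5-sonnet": 200_000}


def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real tokenizers vary by model and language."""
    return max(1, len(text) // chars_per_token)


def fits_context(prompt, max_tokens, limit):
    """True if the estimated prompt size plus the output budget fits the window."""
    return estimate_tokens(prompt) + max_tokens <= limit


prompt = "word " * 110_000  # 550,000 characters, about 137,500 estimated tokens

print(fits_context(prompt, 1_000, CONTEXT_LIMITS["gpt-4o"]))
print(fits_context(prompt, 1_000, CONTEXT_LIMITS["claude-3.5-sonnet"]))
```

With this prompt the estimate exceeds the 128K window but fits in 200K, so the first check fails and the second passes. Leave some headroom in practice, since the estimate can be off by a wide margin.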


API documentation: nanogpt.com/docs. Last updated: July 2026.

For more guides on using AI privately, visit AI Privacy Tools.