Duration: 10 minutes | Block: 5
Key Principle
Cheap model for ROUTING & CLASSIFICATION
Strong model for REASONING & GENERATION

Model Comparison (2026)
| Model | Input cost (USD per 1M tokens) | Output cost (USD per 1M tokens) | Speed | Reasoning | Best For |
|---|---|---|---|---|---|
| GPT-4o-mini | $0.15 | $0.60 | ⚡⚡⚡ | ⭐⭐ | Classification, simple tasks |
| GPT-4o | $2.50 | $10 | ⚡⚡ | ⭐⭐⭐⭐ | General purpose, complex tasks |
| Claude 3.5 Sonnet | $3 | $15 | ⚡⚡ | ⭐⭐⭐⭐⭐ | Writing, analysis, nuanced tasks |
| Claude 3 Haiku | $0.25 | $1.25 | ⚡⚡⚡ | ⭐⭐ | Quick classification, routing |
| Gemini 1.5 Flash | $0.075 | $0.30 | ⚡⚡⚡ | ⭐⭐ | Budget, high volume |
| Gemini 1.5 Pro | $1.25 | $5 | ⚡⚡ | ⭐⭐⭐⭐ | Long context, multi-modal |
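
To make the table concrete: the cost of one call is input tokens times the input price plus output tokens times the output price, divided by one million. The following is a minimal Python sketch with prices copied from the table above; `PRICES` and `cost_per_call` are illustrative names, not part of any provider SDK, and real prices should be checked against each provider's current price list.

```python
# Prices in USD per 1M tokens (input, output), copied from the comparison table.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "claude-3.5-sonnet": (3.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gemini-1.5-flash": (0.075, 0.30),
    "gemini-1.5-pro": (1.25, 5.00),
}


def cost_per_call(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single call."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Price the same workload (2,000 tokens in, 1,000 tokens out) on every model,
    # cheapest first.
    for model in sorted(PRICES, key=lambda m: cost_per_call(m, 2000, 1000)):
        print(f"{model:<20} ${cost_per_call(model, 2000, 1000):.5f}")
```

Running the same workload through every row makes the gap between tiers visible before committing a task to a model.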
Task-to-Model Mapping
Tier 1: Cheap & Fast (Classification, Routing)
Model: GPT-4o-mini / Claude Haiku / Gemini Flash
Cost: ~$0.001 per task
Tasks:
- "Classify this inquiry: sales/support/billing"
- "Extract the name and email from this text"
- "Is this urgent? yes/no"
- "Translate to English"
- "Summarize in 1 sentence"

Tier 2: Mid-Range (Generation, Formatting)
Model: GPT-4o / Gemini Pro
Cost: ~$0.01-0.05 per task
Tasks:
- "Draft a follow-up email in a professional tone"
- "Generate 5 content ideas for social media"
- "Format this data into a report"
- "Write product description"

Tier 3: Premium (Complex Reasoning, Analysis)
Model: Claude Sonnet / GPT-4o
Cost: ~$0.05-0.50 per task
Tasks:
- "Analyze the sales data and provide insights"
- "Research competitors and create a positioning strategy"
- "Draft a contract based on complex requirements"
- "Evaluate and improve an existing draft"

Cost Calculator per Pattern
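
Every estimate in this section uses the same arithmetic: add up the cost of the model calls in one execution, then multiply by how often the pattern runs per month. The Python sketch below reproduces that roll-up using the per-step figures quoted for the three patterns in this section. The `Pattern` class is a hypothetical helper, and the exchange rate of Rp 16,000 per USD is an assumption (the rupiah figures in the text imply a rate close to it).

```python
from dataclasses import dataclass

USD_TO_IDR = 16_000  # Assumed exchange rate, for illustration only.


@dataclass
class Pattern:
    name: str
    step_costs_usd: list[float]  # Cost of each model call in one execution.
    runs_per_month: int

    def monthly_usd(self) -> float:
        """Cost of one execution times the number of executions per month."""
        return sum(self.step_costs_usd) * self.runs_per_month


PATTERNS = [
    Pattern("Daily Brief", [0.015], 30),                     # once per day
    Pattern("Customer Auto-Reply", [0.0002, 0.01], 50 * 30),  # 50 times per day
    Pattern("Weekly Research Report", [0.10], 4),            # once per week
]

if __name__ == "__main__":
    for p in PATTERNS:
        usd = p.monthly_usd()
        print(f"{p.name:<24} ${usd:6.2f}/month  (~Rp {usd * USD_TO_IDR:,.0f})")
```

Keeping the step costs as a list makes it easy to see which call dominates a pattern: in the auto-reply case almost all of the spend comes from the drafting step, not the classification step.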
Pattern: Daily Brief (Cron)
Task: Analyze data + generate brief (Tier 2)
Frequency: once per day
Model: GPT-4o
Per execution:
- Input: ~2,000 tokens = $0.005
- Output: ~1,000 tokens = $0.01
- Total: $0.015
Per month: $0.015 × 30 = $0.45 ≈ Rp 7,000

Pattern: Customer Auto-Reply
Task: Classify + draft reply (Tier 1 + Tier 2)
Frequency: 50 times per day
Classification (Tier 1 - Haiku):
- Per inquiry: ~$0.0002
- Per day: $0.0002 × 50 = $0.01
Draft reply (Tier 2 - GPT-4o):
- Per inquiry: ~$0.01
- Per day: $0.01 × 50 = $0.50
Per month: ($0.01 + $0.50) × 30 = $15.30 ≈ Rp 245,000

Pattern: Weekly Research Report
Task: Research + analyze + report (Tier 3)
Frequency: once per week
Model: Claude Sonnet
Per execution:
- Input: ~5,000 tokens = $0.015
- Output: ~3,000 tokens = $0.045
- Web search: ~5 calls
- Total: ~$0.10 (tokens plus web search calls)
Per month: $0.10 × 4 = $0.40 ≈ Rp 6,400

Optimization Tips
- Route first, generate second: classify with a cheap model, then generate with a strong one
- Cache results: if the same question comes in again, there is no need to call the model again
- Batch when possible: send multiple tasks in one request to cut overhead
- Set max tokens: cap the output length to avoid waste
- Monitor usage: check spend weekly and switch models if you go over budget
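
The first, second, and fourth tips can be combined in a few lines. The Python sketch below routes with a cheap step, generates with a stronger one, caps output length, and caches identical inquiries with the standard library's `functools.lru_cache`. Both model calls are stubs: `cheap_classify` uses keyword rules in place of a real Tier 1 call, and `strong_generate` only formats a string. All function names and the `max_tokens=300` limit are assumptions for illustration.

```python
from functools import lru_cache

CHEAP_MODEL = "gpt-4o-mini"  # Tier 1: routing and classification
STRONG_MODEL = "gpt-4o"      # Tier 2: generation
CALL_LOG: list[str] = []     # Records which model each stubbed call used


def cheap_classify(inquiry: str) -> str:
    """Stand-in for a Tier 1 call; keyword rules replace the real model here."""
    CALL_LOG.append(CHEAP_MODEL)
    text = inquiry.lower()
    if "invoice" in text or "refund" in text:
        return "billing"
    if "price" in text or "buy" in text:
        return "sales"
    return "support"


def strong_generate(prompt: str, max_tokens: int) -> str:
    """Stand-in for a Tier 2 call; a real client would enforce max_tokens."""
    CALL_LOG.append(STRONG_MODEL)
    return f"[{STRONG_MODEL}, max_tokens={max_tokens}] {prompt}"


@lru_cache(maxsize=1024)
def answer(inquiry: str) -> str:
    """Route first, then generate; identical inquiries are served from cache."""
    category = cheap_classify(inquiry)
    return strong_generate(f"Draft a {category} reply to: {inquiry}", max_tokens=300)


if __name__ == "__main__":
    print(answer("Where is my refund?"))
    print(answer("Where is my refund?"))  # Cache hit: no new model calls.
    print(f"Model calls made: {CALL_LOG}")
```

An exact-match cache like this only helps when inquiries repeat verbatim; normalizing whitespace and case before the lookup is a common next step.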
Hermes Config for Multi-Model
```yaml
# hermes.config.yaml
models:
  default: "gpt-4o-mini"            # Daily, simple tasks
  reasoning: "claude-3.5-sonnet"    # Analysis, writing
  classification: "gpt-4o-mini"     # Routing, classification
# Hermes automatically picks a model based on task complexity,
# or you can specify one manually per task
```
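
How Hermes resolves these keys internally is not shown here. Purely as an illustration, the lookup the comments describe (a manual per-task choice first, otherwise a role, otherwise the default) could be sketched in Python as follows. `MODELS` and `resolve_model` are hypothetical names, not Hermes APIs.

```python
from typing import Optional

# Mirrors the config above as a plain dict; loading the YAML file is omitted.
MODELS = {
    "default": "gpt-4o-mini",
    "reasoning": "claude-3.5-sonnet",
    "classification": "gpt-4o-mini",
}


def resolve_model(role: Optional[str] = None, override: Optional[str] = None) -> str:
    """Pick a model: explicit override first, then the role, then the default."""
    if override:
        return override
    return MODELS.get(role or "default", MODELS["default"])


if __name__ == "__main__":
    print(resolve_model("reasoning"))
    print(resolve_model("unknown-role"))       # Falls back to the default.
    print(resolve_model(override="gpt-4o"))    # Manual per-task choice wins.
```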