Qwen 3 Embedding 4B by Inference | AI model information

Model details

Qwen 3 Embedding 4B

Quick Info

Provider: Inference
Model key: qwen/qwen3-embedding-4b
Release date: Jan 1, 2025

Cost

A provider subscription or plan supersedes token-based pricing for this model.

Limits

Output tokens: 2,048 tokens
Context window: 32,000 tokens
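
As a rough illustration of how the model key and context window above might be used together, the sketch below builds an embedding request body and rejects inputs that look too long. The request shape (a JSON object with "model" and "input" fields) is an assumption borrowed from the common OpenAI-style embeddings convention, not something this page confirms for the provider, and the whitespace-based token estimate is a crude stand-in for the model's real tokenizer. The function names are hypothetical.

```python
import json

MODEL_KEY = "qwen/qwen3-embedding-4b"  # model key listed under Quick Info
CONTEXT_WINDOW = 32_000                # context window listed under Limits


def approx_token_count(text: str) -> int:
    """Crude estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def build_embedding_request(texts: list[str]) -> str:
    """Return a JSON body in an assumed OpenAI-style {"model", "input"} shape.

    Raises ValueError if any input's estimated token count exceeds the window.
    """
    for i, text in enumerate(texts):
        n = approx_token_count(text)
        if n > CONTEXT_WINDOW:
            raise ValueError(
                f"input {i} has about {n} tokens, over the {CONTEXT_WINDOW}-token window"
            )
    return json.dumps({"model": MODEL_KEY, "input": texts})


if __name__ == "__main__":
    print(build_embedding_request(["hello world"]))
```

In practice the provider's own tokenizer and API documentation should decide both the token count and the request format; this only shows where the two listed values would plug in.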

Recent tweets and retweets from Inference

Jul 17, 2026, 5:43 PM UTC

5 things @samhogan does differently
> signs company documents from the terminal
> trained 5 function-specific @inference_net agents
> keeps his entire to-do list in one apple note (over a year old)
> moved his entire team to @opencode for llm portability
> blocks twitter…

Jul 11, 2026, 8:42 PM UTC

We're releasing Inference AutoTune
Distill any frontier model into a 1-30B parameter task-specific SLM with only 25 lines of code
automatically route requests to reduce cost and latency by >90%
~2 hours and <$250 to train. You own the weights
Available in private beta today

Jul 9, 2026, 1:25 AM UTC

we're helping a customer spending $60k/mo move from OpenAI & Anthropic to open source models
they use almost every model offered by the labs, so we needed to find replacements for all of them
after generating evals, this is what we landed on
new cost: $12k/mo, 80% savings

Jun 29, 2026, 4:25 PM UTC

I'll be on @MTSlive today at 4 pm discussing GLM 5.2 adoption, what this means for frontier labs, and The Shape of Inference in a post-GLM 5.2 world

Jun 28, 2026, 10:58 PM UTC

Execs at Google are probably calling the US government right now and begging them to withhold the next Gemini release