# OpenRouter

Published 2025-05-06

# 1. Llama3 70B Instruct:

  • A 70B parameter instruction-tuned model from Meta Llama 3 series, optimized for dialogue scenarios, demonstrating excellent performance across multiple industry benchmarks.

# 2. Llama3 8B Instruct:

  • The 8B parameter version of the Llama 3 series, suitable for resource-constrained environments, supporting instruction tuning and appropriate for lightweight applications.

# 3. Llama3.1 405B:

  • An ultra-large model in the Llama 3.1 series with 405B parameters, supporting multilingual and long context processing, suitable for complex tasks.

# 4. Llama2 70B Chat:

  • A 70B parameter dialogue-optimized model from Meta Llama 2 series, specifically designed for chat applications, enhancing conversation quality and safety.

# 5. Llama Guard 3 8B:

  • An 8B parameter model from the Llama 3.1 series, focused on content safety classification, usable for input and output content moderation.

# 6. Mistral Large:

  • A large language model provided by Mistral AI with undisclosed specific parameters, emphasizing high performance and throughput, suitable for commercial-grade deployment.

# 7. Mixtral 8x22B:

  • Mistral AI Mixtral model, utilizing a sparse mixture of experts (SMoE) architecture with 141B total parameters, activating 39B parameters per inference, achieving efficient inference.

# 8. Codestral 2501:

  • Mistral AI code generation model with optimized architecture and tokenizer, increasing code generation speed by approximately twofold, excelling in Fill-in-the-Middle (FIM) tasks.

# 9. Mistral 7B Instruct:

  • An instruction-tuned version of Mistral 7B, supporting a 32K context window, suitable for tasks requiring long context processing.

# 10. DeepSeek V3:

  • DeepSeek flagship model employing a mixture of experts architecture with 671B total parameters, activating 37B parameters per inference, supporting ultra-long context processing, suitable for complex tasks.

# 11. DeepSeek R1:

  • DeepSeek R1 model, focusing on mathematics, code, and complex reasoning tasks, trained using reinforcement learning, open-source and cost-effective.

# 12. Qwen2.5 72B Instruct:

  • A 72B parameter instruction-tuned model from the Tongyi Qianwen series, suitable for complex instruction execution and multi-domain applications.

# 13. Qwen-Turbo/Plus/Max:

  • Three variants of the Tongyi Qianwen series: Turbo emphasizes speed, Plus provides balanced performance, and Max supports longer context and complex tasks, catering to different needs.

# 14. Gemini Pro 1.0:

  • Google DeepMind Gemini Pro model, supporting multimodal input, featuring powerful reasoning and coding capabilities, suitable for complex tasks.

# 15. Gemma 2 27B:

  • An open-source model released by Google with 27B parameters, aimed at providing high-performance language models for developers and researchers.

# 16. Command R+:

  • An enhanced version of the Command series model, emphasizing command execution, task planning, and multi-step reasoning, suitable for enterprise automation applications.

# 17. Command R:

  • The base version of the Command series model, used for command execution and simple reasoning, suitable for general tasks and low-latency requirement scenarios.

# 18. GPT-4/GPT-4 Turbo:

  • OpenAI GPT-4 series models, supporting multimodal input, with GPT-4 Turbo being an optimized version featuring lower latency and larger context windows.

# 19. GPT-3.5 Turbo:

  • A high-speed, economical version in OpenAI GPT-3.5 series, primarily used for real-time conversation and simple tasks.

# 20. Claude v2.1/v2:

  • Models from Anthropic Claude series, focusing on safety and output alignment, providing a gentle, polite conversational experience.

# 21. Grok 3 Beta:

  • xAI Grok 3 Beta model, in the testing phase, known for its unique humor and personalized responses, possessing certain image generation capabilities.

# 22. TheDrummer: Anubis Pro 105B:

  • A large model with 105B parameters, positioned as a professional version, suitable for complex and high-precision large-scale tasks.

# 23. Goliath 120B:

  • An ultra-large-scale model with 120B parameters, providing top-tier general language understanding and generation capabilities, suitable for enterprise-level and high-demand applications.

# 24. Llama3.3 Euryale 70B:

  • A 70B parameter model in the Llama 3.3 series, further optimized to enhance reasoning capabilities and conversation quality.

# 25. NeverSleep: Lumimaid v0.2 70B:

  • NeverSleep's Lumimaid v0.2 model with 70B parameters, focusing on continuous dialogue and multi-round interaction, suitable for long-duration conversation tasks.

# 26. Nous: Hermes 3 405B Instruct:

  • An instruction-tuned 405B parameter model from Nous Research's Hermes 3 series, emphasizing high-quality instruction following and multilingual support, suitable for complex tasks.