AliCloud Releases Qwen 2, an Open-Source Model Whose Performance Beats All Current Open-Source Models and Domestic Closed-Source Models

Overview

Brain Titan
Jun 9, 2024

Alibaba Cloud has unveiled its groundbreaking Qwen 2 open-source large language model (LLM), surpassing all existing open-source and domestic closed-source models in terms of performance. Qwen 2 boasts a comprehensive range of capabilities, including exceptional natural language processing (NLP) abilities, robust coding and mathematical skills, and impressive multilingual proficiency.

Key Features

  • Model Sizes: Qwen 2 encompasses a diverse range of model sizes, catering to varied needs and applications. These include:
      • Qwen2-0.5B: Ideal for resource-constrained devices and tasks with low computational demands
      • Qwen2-1.5B: Suitable for medium-scale tasks requiring greater complexity than Qwen2-0.5B
      • Qwen2-7B: A mid-sized model well-suited for tasks demanding robust language processing capabilities
      • Qwen2-57B-A14B: A large-scale mixture-of-experts model (57B total parameters, 14B activated per token) designed for tasks requiring high performance and precision
      • Qwen2-72B: The series’ largest model, tailored for handling highly complex tasks and extensive datasets
  • GQA Technology: All Qwen 2 models leverage Grouped Query Attention (GQA), significantly boosting inference speed and reducing memory consumption; a minimal sketch of the mechanism follows this list.
  • Context Length: Qwen 2 models are trained with a 32K-token context length, enabling them to process lengthy texts effectively, and they also demonstrate strong performance when extended to 128K-token contexts.
  • Instruction-Fine-Tuned Models: Qwen 2 offers a suite of instruction-fine-tuned models specifically optimized for particular tasks. These include Qwen2-7B-Instruct and Qwen2-72B-Instruct, capable of processing contexts reaching 128K tokens.
  • Multilingual Capabilities: Qwen 2 models are optimized for 27 languages in addition to Chinese and English, exhibiting significant enhancements in multilingual processing.
  • Shared Parameter Technique: The smaller models use tied embeddings, sharing parameters between the input embedding and output projection layers. This raises the proportion of non-embedding parameters, improving performance within the same size constraints; see the second sketch below.
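For readers who want the intuition behind GQA, here is a minimal PyTorch sketch (causal masking and other production details omitted). It illustrates the general mechanism only and is not Qwen 2's actual implementation:

```python
import torch

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention: several query heads share each
    key/value head, shrinking the KV cache and speeding decoding.

    q: (batch, n_q_heads, seq, head_dim)
    k, v: (batch, n_kv_heads, seq, head_dim), with n_kv_heads < n_q_heads
    """
    n_q_heads, head_dim = q.shape[1], q.shape[-1]
    group_size = n_q_heads // k.shape[1]
    # Broadcast each KV head to the query heads in its group.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    return scores.softmax(dim=-1) @ v

# 8 query heads attend through only 2 KV heads.
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 8, 16, 64])
```

Because k and v carry far fewer heads, the KV cache kept around during generation shrinks by the same factor, which is where the speed and memory gains come from.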
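And here is the second sketch, a toy illustration of tied embeddings; the vocabulary and hidden sizes are placeholders, not Qwen 2's real configuration:

```python
import torch
import torch.nn as nn

class TinyTiedLM(nn.Module):
    """Toy language model illustrating tied embeddings: the output
    projection reuses the input embedding matrix, so the vocabulary
    weights are stored (and trained) only once."""
    def __init__(self, vocab_size=32000, hidden=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)
        self.lm_head.weight = self.embed.weight  # parameter sharing

    def forward(self, token_ids):
        h = self.embed(token_ids)  # (batch, seq, hidden)
        # ... transformer layers would run here ...
        return self.lm_head(h)     # (batch, seq, vocab) logits

model = TinyTiedLM()
assert model.lm_head.weight is model.embed.weight  # one tensor, two roles
```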

Benchmark Results

Qwen 2 models have consistently outperformed their counterparts in various benchmarks, showcasing exceptional performance across a wide spectrum of tasks. Notable highlights include:

  • Natural Language Processing: Qwen2-72B surpasses current leading open-source models, including Llama-3-70B and Qwen1.5-110B, in natural language understanding, knowledge mastery, code generation, mathematical capabilities, and multilingual processing.
  • Coding and Mathematical Skills: Qwen 2 models exhibit remarkable proficiency in various programming languages, particularly excelling in code generation and comprehension tasks. Qwen2-72B-Instruct, trained on a large corpus of high-quality mathematical data, demonstrates exceptional problem-solving abilities.
  • Long Text Processing: Qwen 2 models excel in handling long texts. All Instruct models are trained on 32K-token contexts and can be extended to even longer lengths using techniques like YaRN or Dual Chunk Attention. Qwen2-72B-Instruct demonstrates consistent performance on contexts reaching 128K tokens, while Qwen2-7B-Instruct and Qwen2-57B-A14B-Instruct can effectively process 128K and 64K tokens, respectively. A configuration sketch for enabling YaRN follows this list.
  • Multilingual Processing: Qwen 2 models showcase outstanding multilingual capabilities across the 27 languages beyond Chinese and English. Unintended language switching is significantly reduced, enabling smoother handling of multilingual content.
  • Safety: Qwen2-72B-Instruct matches GPT-4’s performance in handling multilingual unsafe queries, including illegal activities, fraud, pornography, and privacy violations, and it outperforms Mixtral-8x22B by a considerable margin. In evaluations on jailbreak prompts translated into multiple languages, Qwen 2 models demonstrate robust safety behavior.
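The Qwen2 release notes describe enabling YaRN by adding a rope_scaling entry to a checkpoint's config.json. The sketch below applies that setting from Python; the local path is a placeholder, and the exact values should be checked against the official documentation:

```python
import json

# Placeholder path to a locally downloaded Qwen2-7B-Instruct checkpoint.
cfg_path = "Qwen2-7B-Instruct/config.json"

with open(cfg_path) as f:
    cfg = json.load(f)

# YaRN rope scaling: a 4x factor stretches the 32K training context
# toward 128K at inference time.
cfg["rope_scaling"] = {
    "type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```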

Detailed Benchmark Results

  • Natural Language Understanding: Qwen2-72B excels in natural language understanding benchmarks, surpassing other leading models like Llama-3-70B.
  • Knowledge Mastery: In knowledge assessment tasks, Qwen2-72B demonstrates remarkable proficiency in answering complex questions.
  • Code Generation: Qwen 2 models shine in code generation tasks, particularly in multilingual programming environments.
  • Mathematical Capabilities: Qwen2-72B-Instruct exhibits exceptional mathematical problem-solving abilities thanks to its training on a large corpus of high-quality mathematical data.
  • Multilingual Processing: Optimized for 27 languages, Qwen 2 models exhibit reduced language switching and enhanced overall multilingual handling.
  • Long Text Processing: Qwen2-72B-Instruct and other models consistently perform well on long text tasks, effectively processing extended contexts.

Availability

Qwen 2 models are openly available on Hugging Face and ModelScope; refer to the respective model cards for detailed usage instructions. A minimal quickstart follows.
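As a starting point, the instruction-tuned checkpoints follow the standard transformers chat pattern. The sketch below loads Qwen2-7B-Instruct; it assumes a recent transformers release, and the official model card remains the authoritative reference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Briefly introduce the Qwen2 model family."},
]
# Render the chat into the model's expected prompt format.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Drop the prompt tokens and keep only the newly generated reply.
reply = tokenizer.decode(
    output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(reply)
```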

Download: https://modelscope.cn/organization/qwen

Github: https://github.com/QwenLM/Qwen2

Try online: https://huggingface.co/spaces/Qwen/Qwen2-72B-Instruct

Official: https://qwenlm.github.io/blog/qwen2/

