Z.AI Launches GLM-4.7, Setting New SOTA in AI Models

Introduction

Z.AI recently launched its new model, GLM-4.7, which has set new state-of-the-art (SOTA) results on multiple benchmarks. It is widely regarded as the strongest coding model from China and has attracted significant attention from both technical and non-technical professionals.

GLM-4.7 has claimed the title of the strongest open-source model on the LM Arena WebDev leaderboard, surpassing GPT-5.2 and Claude Sonnet 4.5.

Additionally, it has topped the Hugging Face model leaderboard.

On December 24, 2025, the Z.AI team held an AMA (Ask Me Anything) session on Reddit, answering questions from the community for more than three hours and generating over 800 interactions.

Key Highlights from the AMA

The Z.AI team provided insights on several key topics:

Information about Z.AI’s IPO
Plans for a dedicated programming model
The reasoning behind GLM-4.7’s logical consistency
Development of the model’s UI aesthetic capabilities
Release timeline for GLM-5 and upcoming products

Model Performance

One of the most discussed topics was the significant performance leap of GLM-4.7. The Z.AI team explained that they made critical adjustments during the post-training phase to enhance the model’s capabilities.

They utilized a refined release recipe during the SFT (Supervised Fine-Tuning) and RL (Reinforcement Learning) phases:

Data from various sources was mixed in appropriate ratios, and contradictory data was removed.
When addressing specific weaknesses, adjustments were kept targeted so that other capabilities were not affected.
The model was re-evaluated repeatedly to confirm that the improvements held across the board.
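
The first point above, mixing sources by ratio and removing contradictory data, can be illustrated with a minimal sketch. This is not Z.AI's actual recipe: the function names, the 3:1 ratio, the toy examples, and the rule "a prompt with conflicting answers is dropped entirely" are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def drop_contradictions(examples):
    """Remove every prompt whose examples disagree on the answer (assumed rule)."""
    answers = defaultdict(set)
    for ex in examples:
        answers[ex["prompt"]].add(ex["answer"])
    return [ex for ex in examples if len(answers[ex["prompt"]]) == 1]

def mix_sources(sources, ratios, total, seed=0):
    """Sample about `total` examples, drawing from each source in proportion to its ratio."""
    rng = random.Random(seed)
    weight_sum = sum(ratios.values())
    mixed = []
    for name, examples in sources.items():
        quota = round(total * ratios[name] / weight_sum)
        # Sample without replacement, capped at what the source can supply.
        mixed.extend(rng.sample(examples, min(quota, len(examples))))
    rng.shuffle(mixed)
    return mixed

# Hypothetical toy data: one "chat" prompt has two conflicting answers.
sources = {
    "code": [{"prompt": f"code-{i}", "answer": "ok"} for i in range(10)],
    "chat": [
        {"prompt": "capital of France?", "answer": "Paris"},
        {"prompt": "capital of France?", "answer": "Lyon"},
        {"prompt": "2 + 2?", "answer": "4"},
        {"prompt": "3 + 3?", "answer": "6"},
    ],
}
cleaned = {name: drop_contradictions(exs) for name, exs in sources.items()}
batch = mix_sources(cleaned, ratios={"code": 3, "chat": 1}, total=8)
```

In practice the ratios themselves would be tuned by evaluation rather than fixed up front, which is where the repeated validation described above comes in.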

The team also shared their entire pre-training data process:

Data collection involved thorough cleaning, deduplication, and quality screening to eliminate noise.
Different domains followed specific rules for data selection.
Data was included in training only after experiments on smaller models showed that it produced stable positive gains.

This process significantly improved data effectiveness.
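
A rough sketch of such a pipeline follows. The team did not publish its filtering rules, so every heuristic here (exact-match deduplication, the word-count and symbol-ratio thresholds, and the `accept_source` gate on small-model scores) is a hypothetical stand-in for the cleaning, deduplication, quality screening, and empirical validation steps described above.

```python
import hashlib
import re

def clean(text):
    """Strip control characters and collapse runs of whitespace."""
    text = re.sub(r"[\x00-\x08\x0b-\x1f]", "", text)
    return re.sub(r"\s+", " ", text).strip()

def dedup(docs):
    """Drop exact duplicates, comparing case-insensitively via a hash."""
    seen, unique = set(), []
    for doc in docs:
        key = hashlib.sha256(doc.lower().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

def quality_ok(doc, min_words=5, max_symbol_ratio=0.3):
    """Toy heuristic: enough words, and not dominated by non-alphanumeric symbols."""
    if len(doc.split()) < min_words:
        return False
    symbols = sum(1 for ch in doc if not (ch.isalnum() or ch.isspace()))
    return symbols / len(doc) <= max_symbol_ratio

def accept_source(baseline_score, score_with_source, min_gain=0.0):
    """Keep a data source only if a small-model ablation shows a gain (assumed gate)."""
    return score_with_source - baseline_score > min_gain

raw = [
    "The  quick brown fox\tjumps over the lazy dog.",
    "the quick brown fox jumps over the lazy dog.",  # duplicate after normalization
    "$$$ ### !!! @@@ %%% ^^^",                       # mostly symbols
    "too short",
]
docs = [d for d in dedup([clean(r) for r in raw]) if quality_ok(d)]
```

Real pipelines typically use near-duplicate detection and learned quality classifiers instead of these simple rules; the sketch only shows how the stages compose.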

Programming Capabilities

When asked about GLM-4.7’s programming abilities, the Z.AI team said that it excels at real software engineering tasks and provides a solid experience in terminal use and vibe coding. In familiar environments with verifiable outcomes, such as finding and fixing bugs in common projects, GLM-4.7 performs reliably. However, it may struggle with unfamiliar frameworks or entirely new functionality because of limited exposure during training.

The team indicated that they plan to enhance the model’s front-end and back-end capabilities and improve stability in long-task, multi-step scenarios.

A key innovation in GLM-4.7’s reasoning mechanism is the introduction of Interleaved Thinking, Preserved Thinking, and Turn-level Thinking. Interleaved Thinking is described as an improved form of chain-of-thought reasoning in which the model reasons before each action rather than only once at the start.
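
The reason-before-each-action pattern can be sketched as a simple agent loop. This is only an illustration of the idea, not GLM-4.7's implementation or API: the `Agent` class, its `think` method, and the `run_tests` and `apply_fix` tools are hypothetical stand-ins for model and tool calls.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    thought: str
    action: str
    observation: str

@dataclass
class Agent:
    """Toy agent; `think` is a stub where a real system would query the model."""
    tools: dict
    trace: list = field(default_factory=list)

    def think(self, goal, last_observation):
        # Decide the next action from the latest observation (fixed toy policy).
        if last_observation is None:
            return "I should run the tests to see what fails.", "run_tests"
        if "FAILED" in last_observation:
            return "A test fails, so I should apply a fix.", "apply_fix"
        if last_observation == "patch applied":
            return "I should rerun the tests to verify the fix.", "run_tests"
        return "The tests pass, so I am done.", "finish"

    def run(self, goal, max_steps=5):
        observation = None
        for _ in range(max_steps):
            # Reason first, using the most recent observation...
            thought, action = self.think(goal, observation)
            if action == "finish":
                self.trace.append(Step(thought, action, ""))
                break
            # ...then act, and feed the result into the next round of reasoning.
            observation = self.tools[action]()
            self.trace.append(Step(thought, action, observation))
        return self.trace

state = {"fixed": False}

def run_tests():
    return "1 passed" if state["fixed"] else "1 FAILED"

def apply_fix():
    state["fixed"] = True
    return "patch applied"

agent = Agent(tools={"run_tests": run_tests, "apply_fix": apply_fix})
trace = agent.run("make the test suite pass")
actions = [step.action for step in trace]
```

Each entry in `trace` pairs a thought with the action it led to, which is the property the interleaved design is meant to guarantee.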

Usage and Framework

The Z.AI team has invested significantly in optimizing and adapting GLM-4.7 for the Claude Code agent framework.

GLM-4.7 demonstrates strong multilingual programming capabilities, handling a wide range of programming languages, including less common ones, as well as complex engineering structures. The team emphasized that the choice of agent framework can change final results by approximately 30%, which is why they refined critical areas such as system prompts and tool invocation design in greater depth.

Aesthetic Improvements

GLM-4.7’s aesthetic capabilities have also seen substantial enhancements, with a dedicated web development team focusing on front-end skills.

They collected high-quality web design examples for training and integrated a visual language model (VLM) into their data pipeline, significantly improving UI aesthetics.

GLM-4.7 also offers better immersion in role-playing scenarios, balancing creative freedom with safety filtering.

Future Plans

Beyond model performance, the future direction of the GLM series was a hot topic. Given GPU resource constraints, participants asked whether compute and memory costs might hinder model development.

The Z.AI team responded pragmatically, emphasizing the importance of training and deployment costs in model design. They aim to achieve peak performance within limited parameters while ensuring affordability and ease of deployment.

Regarding version releases, the team hinted at the possibility of skipping versions 4.8 and 4.9 to focus on a more significant upgrade, with GLM-5 potentially on the way.

Open Source Commitment

Z.AI has been well received in the open-source community and recently introduced its reinforcement learning framework, Slime. This framework automates the reinforcement learning process, allowing models to continuously perform tasks and receive feedback for iterative training.

The Z.AI team gave assurances that its pursuit of AGI will not compromise its commitment to open source, stating that both paths will be pursued simultaneously.

Conclusion

In summary, Z.AI has showcased its capabilities with GLM-4.7, presenting not just a model version but a clearer roadmap for deploying models effectively in the real world. While the journey towards true AGI is challenging, the Z.AI team is committed to making substantial contributions along the way.