Integrating AI into Everyday Life
HONG KONG, April 12, 2025 /PRNewswire/ — SenseTime launched its newly upgraded large model series, SenseNova V6, at its Tech Day event held in several locations, including Shanghai and Shenzhen. Leveraging advances in multimodal long chain-of-thought (CoT) training, global memory, and reinforcement learning, the model delivers industry-leading multimodal reasoning capabilities while setting a new benchmark for cost efficiency.
The SenseNova V6 model has been greatly enhanced, with strong advantages in long CoT, reasoning, mathematics, and global memory. Its multimodal reasoning capabilities ranked first in China when benchmarked against GPT-o1, while its data analysis performance outpaced GPT-4o. It also combines high performance with cost efficiency: its multimodal training efficiency matches that of language models, and both its training and inference costs are the lowest in the industry. The new lightweight full-modal interactive model, SenseNova V6 Omni, delivers the most advanced multimodal interactive capabilities in China. It is China's first large model to support in-depth analysis of 10-minute mid-to-long-form videos, and benchmarks against Gemini 2.5 Turbo place it among the strongest in its class.
Dr. Xu Li, Chairman of the Board and CEO of SenseTime, said, "AI’s true purpose is found in our everyday lives. SenseNova V6 has pushed past the boundaries of multimodality, unlocking infinite possibilities in reasoning and intelligence."
Multimodal long-chain reasoning, reinforcement learning, and global memory: SenseNova V6 leads the way in enabling multimodal deep thinking
As a native Mixture of Experts (MoE)-based multimodal general foundation model with over 600 billion parameters, SenseNova V6 has achieved multiple technological breakthroughs. A single model is able to perform a range of tasks across text and multimodal domains, including:
- Long CoT: Trained on over 200B tokens of high-quality multimodal long CoT data, with the longest chains reaching 64K tokens;
- Mathematical Capabilities: Significantly outperformed GPT-4o in data analysis;
- Reasoning Capabilities: Ranked first in China for multimodal deep reasoning, benchmarked against GPT-o1;
- Global Memory: First in China to achieve long-form video understanding, supporting comprehension and deep reasoning over videos up to 10 minutes long.
In leading benchmark evaluations of reasoning and multimodal capabilities, SenseNova V6 achieved state-of-the-art results across multiple metrics.
Key indicators: SenseNova V6 demonstrated strong overall performance in language tasks, on par with leading international models, and excelled across multimodal capabilities. Both its language reasoning and multimodal reasoning are benchmarked against leading international models such as GPT-4.5 and Gemini 2.0 Pro.
Strong reasoning capabilities: From SenseNova 5.5 to V6/V6 Reasoner, the SenseNova unified model demonstrated significant improvements.
SenseTime leverages multi-agent collaboration to synthesize and verify more than 200B tokens of high-quality multimodal long CoT data. Trained on this data, SenseNova V6 has developed exceptional multimodal reasoning capabilities, supporting multimodal long CoTs of up to 64K tokens and enabling long-term thinking.
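The release does not describe the data pipeline in detail, but a synthesize-and-verify workflow of this kind is commonly organized as a generator agent drafting chains of thought and a verifier agent filtering them. The sketch below is a hypothetical illustration under that assumption; the agent functions, fields, and acceptance criterion are placeholders, not SenseTime's disclosed pipeline.

```python
# Minimal sketch of multi-agent CoT data synthesis with verification.
# The generator/verifier functions are placeholders, not SenseTime's actual agents.
from dataclasses import dataclass

@dataclass
class CoTSample:
    question: str
    image_ref: str        # pointer to the visual input (hypothetical field)
    chain_of_thought: str
    answer: str

def generator_agent(question: str, image_ref: str) -> CoTSample:
    """Placeholder: a model call that drafts a long multimodal chain of thought."""
    cot = f"Step-by-step reasoning about {question} given {image_ref} ..."
    return CoTSample(question, image_ref, cot, answer="<draft answer>")

def verifier_agent(sample: CoTSample) -> bool:
    """Placeholder: a second model (or rule checker) that validates each step and
    the final answer; here we only enforce a trivial length constraint as a stand-in
    for the ~64K-token chain budget."""
    return 0 < len(sample.chain_of_thought) <= 64_000

def synthesize_dataset(tasks: list[tuple[str, str]]) -> list[CoTSample]:
    """Keep only samples the verifier accepts, mimicking synthesize-then-verify."""
    accepted = []
    for question, image_ref in tasks:
        sample = generator_agent(question, image_ref)
        if verifier_agent(sample):
            accepted.append(sample)
    return accepted

if __name__ == "__main__":
    data = synthesize_dataset([("What does the chart imply?", "chart_001.png")])
    print(f"accepted {len(data)} verified CoT samples")
```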
In solving complex real-world problems, SenseNova V6 utilizes its robust hybrid image and text understanding and reasoning capabilities to help users with a range of tasks.
For complex document processing scenarios, SenseNova V6 is able to help users with difficult tasks through its strong multimodal reasoning capabilities. For example, in insurance claims processing, SenseNova V6 can assess whether the submitted commercial health insurance claims meet the requirements. It can detect issues such as unnecessary prescriptions and examinations, missing documents, or incomplete submissions.
Leveraging breakthroughs in multimodal reinforcement learning, SenseTime has developed a hybrid reinforcement learning framework for image-text tasks that combines multiple reward models and adapts training to task difficulty.
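The framework itself is not detailed in the release. A multi-reward setup of this kind is often implemented by blending several reward signals and weighting them by sample difficulty; the sketch below illustrates only that general idea, with reward names, scorers, and weights that are assumptions rather than SenseTime's training code.

```python
# Hypothetical illustration of mixing multiple reward models with
# difficulty-dependent weighting; not SenseTime's actual framework.
from typing import Callable

RewardFn = Callable[[str, str], float]  # (prompt, response) -> score in [0, 1]

def correctness_reward(prompt: str, response: str) -> float:
    return 1.0 if "correct" in response else 0.0   # stand-in answer-checking scorer

def grounding_reward(prompt: str, response: str) -> float:
    return 0.5                                     # stand-in image-grounding scorer

REWARD_MODELS: dict[str, RewardFn] = {
    "correctness": correctness_reward,
    "grounding": grounding_reward,
}

def combined_reward(prompt: str, response: str, difficulty: float) -> float:
    """Blend reward-model scores; harder samples (difficulty near 1.0) lean more
    on correctness, easier ones more on grounding-style signals."""
    weights = {
        "correctness": 0.5 + 0.5 * difficulty,
        "grounding": 1.0 - 0.5 * difficulty,
    }
    total_w = sum(weights.values())
    score = sum(weights[name] * fn(prompt, response)
                for name, fn in REWARD_MODELS.items())
    return score / total_w

if __name__ == "__main__":
    print(combined_reward("Read the chart.", "The correct trend is upward.", difficulty=0.8))
```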
China’s first model to break the 10-minute barrier in video understanding, achieving analysis of extended content within seconds
With its global memory capability, SenseNova V6 overcomes the limitations of traditional models that could only support short videos, and now supports full-framerate analysis of 10-minute videos.
With advanced comprehension capabilities, SenseNova V6 is also able to intelligently edit and extract video highlights, helping users to retain memorable moments.
SenseTime’s proprietary technology aligns visual information (images), auditory information (speech and sounds), linguistic information (subtitles and spoken language), and temporal logic to form a multimodal unified sequential representation. Based on this framework, it applies fine-grained cascading compression and content-aware dynamic filtering to achieve high-ratio compression of long videos. A 10-minute video can be compressed into 16K tokens while retaining key semantics.
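For a concrete sense of the budget involved, the sketch below works through the arithmetic of fitting a 10-minute video into 16K tokens and mimics content-aware filtering with a simple per-segment scoring rule. The frame rate, segment length, per-segment token cost, and scoring heuristic are illustrative assumptions, not SenseTime's disclosed compression method.

```python
# Back-of-the-envelope sketch: fitting a 10-minute video into a 16K-token budget
# via content-aware selection; all constants below are illustrative assumptions.
import random

VIDEO_SECONDS = 10 * 60          # 10-minute video
FPS = 25                         # assumed full frame rate
TOKENS_PER_KEPT_SEGMENT = 64     # assumed cost of one retained 1-second segment
TOKEN_BUDGET = 16_000            # target budget stated in the release

def segment_scores(num_segments: int) -> list[float]:
    """Placeholder for a content-aware scorer (e.g. scene change / speech activity)."""
    random.seed(0)
    return [random.random() for _ in range(num_segments)]

def select_segments() -> list[int]:
    """Keep the highest-scoring 1-second segments that fit the token budget."""
    num_segments = VIDEO_SECONDS                      # one candidate segment per second
    keep_k = TOKEN_BUDGET // TOKENS_PER_KEPT_SEGMENT  # 250 segments at these constants
    scores = segment_scores(num_segments)
    ranked = sorted(range(num_segments), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep_k])

if __name__ == "__main__":
    kept = select_segments()
    raw_frames = VIDEO_SECONDS * FPS
    print(f"raw frames: {raw_frames}, kept segments: {len(kept)}, "
          f"token cost: {len(kept) * TOKENS_PER_KEPT_SEGMENT} / {TOKEN_BUDGET}")
```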
Human-like interaction: SenseNova V6 Omni launches with multi-industry deployment
With the launch of SenseNova V6, SenseTime has upgraded its real-time interactive unified large model to SenseNova V6 Omni, with deep optimizations across scenarios including role-playing, translation and reading, cultural tourism guidance, picture-book narration, and mathematical explanation.
In translation and reading scenarios, SenseNova V6 Omni enables users to achieve precise spatial interactions with a simple finger gesture. The model also accurately understands the relationship between local and global information, providing a more intuitive and human-like interactive experience.
SenseNova V6 Omni features more human-like perceptual and expressive abilities, as well as emotional understanding. It has been deployed across multiple industries and scenarios, including embodied intelligence, becoming the first commercialized full-modality real-time interactive model in China.
Full-featured version of SenseChat launched, now available for preview
SenseTime has released a comprehensive update to SenseChat, along with a brand-new app built on the complete capabilities of SenseNova V6. Through a single access point, users can engage in seamless multimodal interactive streaming experiences across text, images, and video.
The SenseChat app is available for preview, and SenseNova V6 can now be trialed via the SenseChat web platform at https://chat.sensetime.com/wb/chat.
RMB100 million in vouchers released to accelerate full-stack scenario implementation
SenseTime also announced a dedicated subsidy of RMB100 million, aimed at advancing emerging fields such as embodied intelligence and AIGC. Through targeted, multi-dimensional initiatives, SenseTime is delivering a one-stop, high-efficiency, low-cost solution for end-to-end AI implementation, spanning expert consulting, model training, and inference validation.
– End –
About SenseTime
SenseTime is a leading AI software company focused on creating a better AI-empowered future through innovation. We are committed to advancing the state of the art in AI research, developing scalable and affordable AI software platforms that benefit businesses, people and society as a whole, while attracting and nurturing top talents to shape the future together.
With our roots in the academic world, we invest in our original and cutting-edge research that allows us to offer and continuously improve industry-leading AI capabilities in universal multimodal and multi-task models, covering key fields across perception intelligence, natural language processing, decision intelligence, AI-enabled content generation, as well as key capabilities in AI chips, sensors and computing infrastructure. Our proprietary AI infrastructure, SenseCore, integrates computing power, algorithms, and platforms, enabling us to build the "SenseNova" foundation model sets and R&D system that unlocks the ability to perform general AI tasks at low cost and with high efficiency. Our technologies are trusted by customers and partners in many industry verticals including Generative AI, Computer Vision and Smart Auto.
SenseTime has been actively involved in the development of national and international industry standards on data security, privacy protection, and ethical and sustainable AI, working closely with multiple domestic and multilateral institutions on ethical and sustainable AI development. SenseTime was the only AI company in Asia to have its Code of Ethics for AI Sustainable Development selected by the United Nations as one of the key publication references in the United Nations Resource Guide on AI Strategies, published in June 2021.
SenseTime Group Inc. has successfully listed on the Main Board of the Stock Exchange of Hong Kong Limited (HKEX). We have offices in markets including Hong Kong, Shanghai, Beijing, Shenzhen, Chengdu, Hangzhou, Nanping, Qingdao, Xi’an, Macau, Kyoto, Tokyo, Singapore, Riyadh, Abu Dhabi, Dubai, Kuala Lumpur and South Korea, as well as a presence in Germany, Thailand, Indonesia and the Philippines. For more information, please visit SenseTime’s official website or LinkedIn, X, Facebook and YouTube pages.