2025
This document details the version iterations of multiple core components within the ONE platform for the year 2025, focusing on three key dimensions: New Features, Functional Optimizations, and Bug Fixes.
The components involved include ONE_ALERT, ONE_ANALYSIS, ONE_LOG, ONE_EVENT, ONE_APM, ONE_ETL, ONE_STM, ONE_RUM, ONE_IAM, ONE_CMDB, ONE_TAG, ONE_Pilot, ONE_CHAT, covering critical areas such as alert configuration management, data visualization analysis, log collection and parsing, event tracking and correlation, application performance monitoring, multi-source data ETL, and digital experience monitoring.
The iterations focus on enhancing product feature completeness, operational ease of use, and runtime stability, while also adapting to new scenarios like multi-region multi-data center deployment and international language environments, helping users achieve full-link observability and management under complex operations.
The following table lists the product components, their product modules, and functional descriptions:
| Product Component | Product Module | Functional Description |
|---|---|---|
| ONE_ALERT | Intelligent Alerting | Manages platform alert rule configuration, alert generation, alert notification, and alert status management (e.g., manual closure, batch processing). Supports alert data filtering, permission isolation, and cross-resource domain alert handling, helping operators promptly detect and respond to system anomalies. |
| ONE_ANALYSIS | Data Analysis (Data Reporting) | Provides data visualization and analysis capabilities such as dashboard configuration, data cube construction, and automated report generation. Supports custom chart display, multi-dimensional data filtering, and report email delivery, helping users intuitively grasp system operation data and business trends. |
| ONE_LOG | Log Analysis | Implements log collection (including deep monitoring process logs, third-party log ingestion), log query, index management, and data processing (e.g., JSON parsing, grok rule processing). Supports filtering logs by service and data center, assisting problem investigation and log auditing. |
| ONE_EVENT | Event Management | Manages event data collection, filtering, display, and correlation analysis. Supports filtering by data center and event attributes, standardizes instance field display, enables event-alert correlation and drill-down, assisting in tracking various system events (e.g., deployment, fault events). |
| ONE_APM | Application Performance Monitoring (System Services) | Focuses on application performance observability, covering service identification, trace tracking, process monitoring, container and K8s resource monitoring, and probe management. Supports trace detail analysis and performance metric statistics, assisting in locating application performance bottlenecks and service dependencies. |
| ONE_ETL | Data Integration | Implements ingestion, processing, and storage of multi-source data (e.g., Alibaba Cloud service metrics, SkyWalking trace data, SNMP protocol data). Supports data flow topology editing and data localization storage, adapting to multi-region multi-data center data architectures. |
| ONE_STM | Availability Monitoring (Monitoring Tasks) | Manages the creation and management of monitoring points and tasks. Supports organizing monitoring resources by tags, records search criteria and quick filter configurations, enabling continuous monitoring of system key nodes to ensure business continuity. |
| ONE_RUM | Digital Experience | Conducts performance and experience monitoring for terminal applications (Web, Douyin Mini Program, HarmonyOS applications, etc.), including page load speed, JS error statistics, and user session analysis, to optimize the end-user experience. |
| ONE_IAM | Identity and Access Management | Manages platform user identity, permissions, License, and security settings. Supports user list filtering, custom headers, multi-language adaptation, and login verification (e.g., SMS verification code). Enables centralized permission management from the primary center, ensuring platform access security. |
| ONE_CMDB | Configuration Management (Data Model) | Stores configuration information and lifecycle status of system resource instances (hosts, services, containers, etc.). Supports resource tag management, instance relationship maintenance, and unified management of multi-data center instances, providing foundational resource data support for other components. |
| ONE_TAG | Tag Management | Provides tag creation, batch association/removal, and automatic propagation functions. Supports filtering resource instances by tags, enabling resource permission control and classification management, adapting to multi-region multi-data center resource tag synchronization needs. |
| ONE_Pilot | ONE Platform Monitoring & Ops (SkyEye Platform) | Provides global topology view of ONE platform components, Nginx container traffic analysis (inbound/outbound traffic, request count statistics), manages Kafka component OOM monitoring alert policies and ClickHouse data storage periods, assisting overall ONE platform operation and traffic control. |
| ONE_CHAT-SERVICE/ONE_CHAT | XiaoRui Assistant | Integrates the DeepSeek-R1 large language model, providing intelligent Q&A services for platform usage instructions, operational knowledge queries, etc. Supports generating PQL query statements, enhancing operator problem-solving efficiency. |
| ONE_BPI/ONE_CORE-LINK | Observability Insights (Multi-dimensional Observability) | Provides core link topology orchestration and panoramic observability capabilities. Supports custom service call relationships, combining monitoring, log, and trace data to achieve cross-service problem localization and fault investigation, aiding efficient operational decision-making. |
ONE 3.9.0.0——2025/12/15
【APM】
New Features
- Added system grouping management capability, supporting 3-level system group configuration for more granular resource control.
- The data collection module now supports configuring Profiling custom collection rules to meet personalized data collection needs.
New Optimizations
- Global Topology Experience Optimization: Enhanced filter box interaction logic and added display of inter-system service call relationships, making the topology view more comprehensive.
- Service Fusion Scenario Upgrade: Added vertical architecture display and view addition functionality on the service details page for more flexible scenario adaptation.
- Unified Alert Query Logic: Standardized operations like viewing and sorting alerts on entity lists and entity detail pages for a more consistent query experience.
- Enhanced Name Mapping Capability: Service and process group names now support dictionary mapping, and advanced mapping options are added to name configuration to adapt to more naming scenarios.
- Indicator System Optimization: Adjusted service instance heap/non-heap metric names and categorized connection pool metrics under the Weblogic metric category for clearer classification.
- Flexible System Model Orchestration: Supports editing system attribute fields to adapt to dynamic business scenario adjustments.
- Process Group Naming Rule Optimization: Added host attribute and label filtering schemes; custom naming rules now support pagination and sorting for more efficient rule management.
【CORE-LINK】
New Features
- Entry interface topology supports automated configuration.
- Cards differentiate user permissions; queried data is segregated by resource domain.
- Added card sharing permissions.
- Key metrics adapted to the November 15 indicator system overhaul.
- Core link sorting functionality revamped.
New Optimizations
- When defining interfaces between nodes after creating a group, interface names under multiple services within the group might be duplicated and indistinguishable; service names are now added for distinction.
- Changed the cloning naming method for core links to a more intuitive format: [Clone] + original core link name.
- Vertical topology now displays the technology type for service entities and service instance entities.
- The link view page supports cloning links.
- Duplicate name validation is performed immediately after entering a name during editing, not only upon saving.
- Added name editing/modification on the list page.
- Core link service page now allows jumping to the service page.
- On the core link list, clicking "Clone" enters the clone page; clicking "Back" returns to the list, not the cloned link's details.
- Optimized the expanded style of vertical topology.
- When two nodes are close, hover-over line indicators are hard to read; added white background to indicator information.
- Optimized operational workflows.
- Optimized cloning logic.
- VM-type metrics are temporarily filtered out when configuring key metrics for core links.
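The clone naming format and the immediate duplicate-name check listed above can be sketched as follows. This is a minimal illustration, not the platform's implementation: `clone_name`, `is_duplicate`, and the in-memory set of names are hypothetical, and the exact spacing after the `[Clone]` prefix is an assumption.

```python
def clone_name(original: str) -> str:
    """Build the default name for a cloned core link: "[Clone]" + original name."""
    return f"[Clone]{original}"


def is_duplicate(candidate: str, existing_names: set[str]) -> bool:
    """Return True if the name is already taken; meant to run as soon as a name is entered."""
    return candidate.strip() in existing_names


if __name__ == "__main__":
    existing = {"checkout-flow"}          # hypothetical existing core link names
    new_name = clone_name("checkout-flow")
    print(new_name, is_duplicate(new_name, existing))
```

Running the check on input rather than on save lets the editor reject a clashing name before the user has filled in the rest of the form.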
【ALERT】
New Features
- Added AI adaptive threshold detection algorithm, supporting alert rule configuration for alarm generation, suitable for real-time monitoring of large-scale metrics.
New Optimizations
- Optimized alert name format: For multi-metric alerts with the same grouping, the grouping value is reported only once in the name.
- In alert names, a comma now separates the current value and the threshold, improving readability.
- Alert IDs and problem IDs are now displayed as 8-digit hexadecimal aligned strings.
- Separated similarity convergence and field-identical convergence. Enumerable fields are not suitable for field similarity convergence.
- Notification method validation now returns specific failure reasons; all methods are validated simultaneously, and reasons are provided for each.
- For entity attributes like "belongs to XX" (instance ID, except host ID), notification content provides the fullname; if unavailable, the instance alias is used.
- Adjusted tooltips in the status column of the alert list, adding a prompt for "no status."
- Processed internationalization entries and modified translations for proper nouns.
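As an illustration of the 8-digit hexadecimal display of alert IDs and problem IDs mentioned in this list, the sketch below zero-pads an integer ID to a fixed width. It assumes IDs are non-negative integers that fit in 32 bits and that lowercase digits are used; the release notes do not state either, and `format_id` is a hypothetical helper.

```python
def format_id(raw_id: int) -> str:
    """Render an ID as a zero-padded, 8-digit hexadecimal string (assumed lowercase)."""
    if not 0 <= raw_id <= 0xFFFFFFFF:
        raise ValueError("ID does not fit in 8 hexadecimal digits")
    return f"{raw_id:08x}"


if __name__ == "__main__":
    print(format_id(255))         # 000000ff
    print(format_id(0xDEADBEEF))  # deadbeef
```

A fixed width keeps IDs aligned in list columns and makes them easier to compare by eye.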
【AI】
New Features
- Added AI adaptive threshold detection method, where AI performs statistical calculations on metrics to provide upper/lower boundary thresholds.
- Added AI intelligent summarization, supporting specific data ranges. Asking XiaoRui questions about real data now triggers an answer.
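The release notes do not describe the algorithm behind the adaptive threshold detection. To make the idea of deriving upper and lower boundaries from metric statistics concrete, the sketch below substitutes a plain mean plus or minus k standard deviations rule. It is a stand-in technique, not the platform's AI method; `adaptive_bounds`, `is_anomalous`, and the default `k` are hypothetical.

```python
from statistics import mean, pstdev


def adaptive_bounds(history: list[float], k: float = 3.0) -> tuple[float, float]:
    """Return (lower, upper) thresholds as mean -/+ k population standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two samples")
    center = mean(history)
    spread = pstdev(history)
    return center - k * spread, center + k * spread


def is_anomalous(value: float, history: list[float], k: float = 3.0) -> bool:
    """Flag a value that falls outside the bounds derived from recent history."""
    lower, upper = adaptive_bounds(history, k)
    return value < lower or value > upper


if __name__ == "__main__":
    recent = [48.0, 52.0, 50.0, 49.0, 51.0]  # hypothetical CPU usage samples
    print(adaptive_bounds(recent), is_anomalous(90.0, recent))
```

Because the bounds are recomputed from recent samples, each metric gets its own threshold without manual tuning, which is what makes this style of detection practical for large numbers of metrics.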
New Optimizations
- Added LLM root cause analysis tool, including a database connection pool component.
- Expanded LLM root cause analysis scenarios, adding a knowledge base for database metric root cause analysis.
- Added "Regenerate," "Like/Dislike" feedback, and "Copy Root Cause Conclusion" functions on the LLM root cause analysis results page.
- Added export functionality for LLM root cause analysis results; the entire analysis page can be exported.
- Added download and copy functions for LLM root cause analysis flowchart images.
- Added a time range description in the top right corner of the LLM root cause analysis details page, indicating the investigation timeframe.
- Comprehensively reviewed the LLM root cause analysis knowledge base; added an English knowledge base to facilitate LLM understanding.
- Optimized frontend and backend for different LLM root cause analysis scenarios, e.g., skipping slow trace queries for error scenarios.
【RUM】
New Features
- Supports configuring whether to collect iOS widget crashes.
- HarmonyOS NEXT lag definition supports custom main thread configuration.
- Android, iOS, HarmonyOS NEXT: User session replay function can be enabled via an API interface, allowing users to invoke the API to start replay in specific scenarios.
- Digital Experience / Code Exceptions: Supports batch tagging, modifying fix status, and adding comments.
- Android, iOS, HarmonyOS NEXT, WeChat Mini Program, Web, Douyin Mini Program: Request tracing (coloring) supports end-to-end collection probability configuration via Nacos, distributed to SDKs, compliant with OT standard collection.
- Installation & Deployment / Client SDK: Added viewing update logs, dynamically fetching version update notes from the IAM backend.
- Network request identification rules: Maximum number changed to 50 rules. Added validation during rule creation/editing: unique URL parameter keys, GET parameter keys, and POST parameter keys across all rules must not exceed 64 each.
- Added an API for the number of monitored applications.
- Supports a toggle to display a prompt on all file upload locations indicating the platform is non-confidential and confidential files should not be uploaded.
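The limits on network request identification rules listed above (at most 50 rules, and at most 64 unique keys each for URL, GET, and POST parameters across all rules) can be checked along these lines. The rule structure and the field names `url_keys`, `get_keys`, and `post_keys` are assumptions made for this sketch, not the platform's schema.

```python
MAX_RULES = 50
MAX_UNIQUE_KEYS = 64
KEY_FIELDS = ("url_keys", "get_keys", "post_keys")  # hypothetical field names


def validate_rules(rules: list[dict]) -> list[str]:
    """Return validation errors; an empty list means the rule set is acceptable."""
    errors = []
    if len(rules) > MAX_RULES:
        errors.append(f"too many rules: {len(rules)} > {MAX_RULES}")
    for field in KEY_FIELDS:
        # Keys are counted once across all rules, not per rule.
        unique = {key for rule in rules for key in rule.get(field, [])}
        if len(unique) > MAX_UNIQUE_KEYS:
            errors.append(f"too many unique {field}: {len(unique)} > {MAX_UNIQUE_KEYS}")
    return errors


if __name__ == "__main__":
    rules = [{"url_keys": ["orderId"], "post_keys": ["userId", "sku"]}]
    print(validate_rules(rules))
```

Collecting every error instead of stopping at the first one lets the rule editor report all violations in a single pass.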
New Optimizations
- Optimized health score text descriptions.
- Search Center / Network Requests: The Trace ID column in the request list now supports click-to-jump.
Bug Fixes
- Fixed the issue where dashboard grouping by application version caused incorrect filter conditions when jumping to the Search Center.
- Fixed the network request identification rule issue: when multiple rules are configured, only the first rule matched correctly; subsequent rules failed to trigger.
- Fixed the network request identification rule condition issue: parameter placeholders like "value1,value2" were incorrectly treated as a single value in the code.
【IAM】
New Optimizations
- Supports administrator customization of the effect for locking users after multiple consecutive incorrect password attempts during login.
- Key text-based columns in the Access Control page list now support custom sorting.
- Environment and resource domain options in the platform environment/resource domain switch function now support user-defined sorting.
- Implemented security compliance requirements.
- Added display of user's real name alongside the account name.
【ANALYSIS】
New Features
- Query variables now support two forms: selection query and dimension query, allowing query results or dimension information to be used as variable values.
- Automatic report configuration now supports selecting a resource domain; reports are sent based on data from the specified domain.
- When generating an online share link for a dashboard, current filter conditions can be included; opening the link automatically populates filter and variable conditions.
New Optimizations
- Automatic reports are now sent with all groups expanded by default to ensure data completeness.
- Linkage interaction now supports linking to trace cards, allowing trace ID passing for linked queries.
- Global topology cards adapted to APM-side adjustments, supporting cascading selection for systems, services, and databases.
- When switching indicator systems causes historically configured metrics to become unavailable, a prompt is shown below the display box.
【LOG】
New Features
- Supports user-defined configuration of multi-line log splitting rules. Multi-line logs can be split based on timestamp information and string patterns, allowing each split log entry to be analyzed separately.
- Container standard output log paths are now also displayed in the collection list.
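Multi-line splitting based on timestamps, as described in the first item above, usually means treating any line that starts with a timestamp as the beginning of a new entry and attaching other lines to the previous entry. The sketch below shows that idea under assumptions: the timestamp layout in `ENTRY_START` and the function name `split_entries` are hypothetical, and the platform's actual rule syntax is not given in these notes.

```python
import re

# Assumed timestamp layout, e.g. "2025-12-15 10:00:00"; real rules are user-configured.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_entries(raw: str, start_pattern: re.Pattern = ENTRY_START) -> list[str]:
    """Group lines into entries; a line matching start_pattern begins a new entry."""
    entries: list[list[str]] = []
    for line in raw.splitlines():
        if start_pattern.match(line) or not entries:
            entries.append([line])
        else:
            entries[-1].append(line)  # continuation line, e.g. a stack trace frame
    return ["\n".join(lines) for lines in entries]


if __name__ == "__main__":
    sample = (
        "2025-12-15 10:00:00 ERROR request failed\n"
        "  at com.example.Handler.run(Handler.java:42)\n"
        "2025-12-15 10:00:01 INFO retry scheduled"
    )
    for entry in split_entries(sample):
        print(repr(entry))
```

Passing a different compiled pattern as `start_pattern` covers the string-pattern case, for example entries that begin with a fixed marker instead of a timestamp.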
New Optimizations
- Other business parties calling the log list API can now pass multiple instance IDs to query logs from multiple instances.
- Optimized export file name formats for list/chart mode exports in the log query section.
- Fixed a bug where uploading files in file management triggered multiple requests upon saving.
- Optimized response results for certain error scenarios in OpenAPI (e.g., non-existent field values, illegal field values) to better indicate which field caused the error.
- Added required field descriptions for modelkey and attributekey when querying entity attributes using the queryfilter field in OpenAPI.
- When switching between Chinese and English in the log configured metrics list, the metric categories, dimensions, and display units also switch accordingly.
【EVENT】
New Optimizations
- After selecting a snapshot, users can now clear it directly without needing to click "Back to Event Query."
- Quick filter fields now support switching between chart and list modes.
- Added log information to export file names for event list and chart mode exports.
- The order of the grouping dropdown when adding quick filters now matches the left-side list.
- After deleting a snapshot, search box conditions are also cleared simultaneously.
- All Event Center APIs now perform permission checks on user operations to prevent lower-privilege users from performing unauthorized actions.
【ETL】
New Features
- Added data integration for VMWare Hypervisor metrics and event collection.
New Optimizations
- Optimized data flow import usage.
- Refactored SmartGate configuration distribution logic, fixing the issue of orphaned components existing in the platform.
【CMDB】
New Optimizations
- The indicator list now supports displaying and filtering/searching by extended attributes.
- Optimized English translations for the indicator system.
ONE 3.8.0.0——2025/11/15
【APM】
New Features
- SQL Execution Plan: Supports filtering by service and configuring SQL rules to obtain SQL execution plans.
- Error Detail Page Export: Log content can now be included when exporting to the local machine.
- License Billing Function Adjustment: Supports fine-grained management of billed quota items.
New Optimizations
- Optimized the interaction for deploying ETL tasks and probe collection configurations.
- Core Link Optimization: When an instance is deleted or taken offline, its status within links is synchronized and updated for display.
- Optimized the display method in the trace detail callMap: aggregation and expansion are now based on entities.
【RUM】
New Features
- New Page Analysis Entry: Added a "Page Analysis" module within Digital Experience, supporting one-click viewing of page load performance, JS error rate, and other metrics to help accurately locate page-level issues.
- Page Load Details: Enhanced with associated data including user operations, network requests, and issues.
- User Operation Details: Enhanced with associated data including network requests and issues.
- Application Launch Details: Enhanced with associated data including network requests and issues.
- Digital Experience Module Interaction Optimization: The "Application" filter is now pinned to the top of the Digital Experience menu. After selecting an application, only menu tabs relevant to that application are displayed (e.g., the Crash tab is hidden for Web applications under Code Exceptions), reducing information clutter and improving operational efficiency.
- Added iOS Extension Widget Observability: Supports collecting and analyzing widget crash data.
- Native Crash Deep Parsing: Supports analysis of native crashes for Android and HarmonyOS NEXT, automatically parsing so library stacks to quickly locate the root cause.
- License Fine-Grained Management: Supports allocating "Monthly Active User" licenses by application. Independent quotas can be set for different applications on the platform, adapting to resource management needs in multi-application scenarios.
- Full Lifecycle Monitoring for Network Requests: Added collection of "Pending state" network requests for Android, iOS, and Web applications.
- Phase Two API Capabilities Released: Supports interfaces for querying application health scores, crash trends, JS error trends, lag trends, network request trends, etc.
New Optimizations
- In the Digital Experience > Code Exceptions > JS Errors module, when grouping by JS file, a custom table header field "Associated Page Count" has been added to visually show the number of pages where errors from that file occur.
- In the service relationship diagram on the application details page, the "Service Identification Name" on service cards has been changed to display the "Service Name," aligning with service dependency analysis from a business perspective.
- In network request identification rules, "GET Parameters" text and collection values have been changed to "URL Parameters," and "POST Parameters" text and collection values have been changed to "Body Parameters."
- RUM custom metrics have been adapted to the new indicator system specification.
Bug Fixes
- Fixed unauthorized access issues: Enhanced role permission verification to restrict non-authorized users from accessing sensitive data (e.g., crash stacks, user behavior details).
- Optimized Terminal Application Health Score: Adjusted weight factors (crash rate and startup time weight increased to 30%) and added a "User Retention Correlation" metric, making the score more reflective of actual experience.
【CHAT-SERVICE】
New Optimizations
- XiaoRui's knowledge base and redirects have been switched to the new documentation center.
Bug Fixes
- Fixed a bug where some fields in AI similarity convergence were not taking effect.
【ALERT】
New Features
- Webhook and email notification methods now support custom notification templates, similar to existing methods, for more flexible use.
- Added Syslog notification method: Allows entering target server address, network protocol type, port, and customizing notification content.
- Adapted the indicator selector to the new indicator system. Default groupings use interfaces provided by CMDB for display.
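For readers who want to test a receiver for the Syslog notification method, the sketch below shows the same four inputs (target server address, protocol, port, and custom content) using Python's standard `logging.handlers.SysLogHandler`. It illustrates the receiving side's expectations only and is not how the platform sends notifications; `build_syslog_logger` and the example template are hypothetical.

```python
import logging
import logging.handlers
import socket

SOCKET_TYPES = {"udp": socket.SOCK_DGRAM, "tcp": socket.SOCK_STREAM}


def build_syslog_logger(host: str, port: int, protocol: str, template: str) -> logging.Logger:
    """Create a logger that forwards messages to a syslog server with a custom template."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=SOCKET_TYPES[protocol.lower()],
    )
    handler.setFormatter(logging.Formatter(template))
    logger = logging.getLogger(f"alert.syslog.{host}.{port}.{protocol.lower()}")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    # Hypothetical target; a TCP target must be reachable when the handler is created.
    log = build_syslog_logger("127.0.0.1", 514, "udp", "ONE alert: %(message)s")
    log.warning("CPU usage 95%%, threshold 90%%")
```

UDP is fire-and-forget, so delivery is not confirmed; TCP opens a connection up front and fails early if the server cannot be reached.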
New Optimizations
- Adjusted permission menus: Notification records, response policies, and suppression policies have been split into separate pages.
- Added explanatory notes next to script functions in private deployment environments, indicating how to delete them if needed.
- Optimized the secondary confirmation pop-up text when deleting alarm rule template libraries, clarifying that if templates have been distributed, they need to be deleted in each resource domain separately.
- Alarm names in English format now correctly display units.
- Non-critical instances cannot be queried via filters in data exploration; the indicator analysis button next to alarms for such instances has been hidden.
Bug Fixes
- Fixed an alarm generation bug for availability metrics where the default grouping was monitoring tasks and instance IDs.
- Fixed a bug where the object field for alarms grouped by disk & disk partition was not displayed correctly.
- Fixed a bug where default notification template content could not switch correctly when toggling between Chinese and English.
- Fixed a bug where indicators in alarm rules could not be displayed normally after switching between Chinese and English.
- Fixed a bug in notification templates where "Belonging Host" and "Belonging Instance" were incorrectly displaying host IP and instance aliases, respectively.
- Entity attributes related to system classification are now hidden in the filter boxes within the Alert module to avoid confusion.
【ANALYSIS】
New Optimizations
- Dashboard indicator selection box adapted to the new indicator system, allowing indicator selection according to the current system's classification.
- General filters and single-item filters adapted to the indicator system.
- Data linkage and jump interactions adapted to the indicator system.
- Compatibility for historically configured indicators not under the current indicator system: they can be queried normally but show an anomaly effect during configuration.
- OpenAPI now supports querying RUM application health scores.
- OpenAPI supports an interface for pulling and downloading automatic report files.
- When a card has multiple data item configurations for linkage interaction, it automatically matches all configured conditions.
【LOG】
New Features
- Added log metrics adapted to the indicator system. Created metrics are placed under fixed categories of preset metrics.
【EVENT】
New Optimizations
- Event details now display all information of related entities and allow adding quick filters or viewing the related entities.
【CMDB】
New Features
- Indicator system refactored to support custom indicator systems. Systems take effect based on configurable resource domains, allowing users to control indicator usage permissions according to business scenarios.
【ETL】
New Features
- Adapted to the new version of the indicator system.
New Optimizations
- Optimized collection configurations for Oracle, DMDB, RabbitMQ, and ClickHouse cards.
- Optimized feedback for data processing error information.
ONE 3.7.0.0——2025/10/30
【ANALYSIS】
New Features
- List cards now support setting a default pagination count.
- Global time changes in the dashboard edit state are now also saved as part of the configuration.
- Dashboard performance optimization; loading speed is improved when multiple cards are present.
- PromQL now supports AI-assisted writing, intelligently generating PromQL statements and filling them into the input box with one click.
- API queries now support metrics query, PQL query, and health score query interfaces.
- View component configuration adjusted; the core link module referencing views can now directly select a dashboard as the current view.
- Supports grouping of custom exception event additional information for RUM.
- Added a call chain scenario card, displaying call chain details and supporting call tree and call map views.
- Localized tech ecosystem adaptation, security improvements, and vulnerability fixes.
【LOG】
New Features
- External indices now support correspondence with ELK; ELK indices can be created, and log metadata can be mapped to ELK indices for querying.
- The log query filter box now supports multiple keyword queries; the relationship between keywords is 'AND'.
- OpenAPI now supports log list and log statistics interfaces.
- Localized tech ecosystem adaptation, security improvements, and vulnerability fixes.
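The AND relationship between multiple keywords in the log query filter box means a log line is returned only if it contains every keyword. A minimal sketch of that semantics follows; the function names are hypothetical, and case-sensitive substring matching is an assumption, since the notes do not say how keywords are compared.

```python
def matches_all(message: str, keywords: list[str]) -> bool:
    """True only if every keyword occurs in the message (AND semantics)."""
    return all(keyword in message for keyword in keywords)


def filter_logs(messages: list[str], keywords: list[str]) -> list[str]:
    """Keep the messages that contain all of the given keywords."""
    return [message for message in messages if matches_all(message, keywords)]


if __name__ == "__main__":
    logs = [
        "ERROR payment timeout order=1001",
        "ERROR inventory timeout sku=77",
        "INFO payment ok order=1002",
    ]
    print(filter_logs(logs, ["ERROR", "payment"]))
```

Each added keyword can only narrow the result set, which is the behavior to expect when refining a search.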
【EVENT】
New Features
- Non-detection events from system detection sources no longer generate raw events; only standardized events are displayed.
- OpenAPI now supports event list and event statistics interfaces.
- Localized tech ecosystem adaptation, security improvements, and vulnerability fixes.
【ETL】
New Features
- Adjusted the card wall layout for data ingestion, providing clearer and more user-friendly categorized viewing and search capabilities.
- Upgraded the data stream processing engine, nearly doubling data processing performance.
- Added an abnormal data operations module, unifying the display of data ingestion exceptions across the ONE platform, adding intelligent feedback on exception causes, enabling rapid awareness, precise localization, and efficient resolution of abnormal issues.
- Alibaba Cloud and Huawei Cloud data collection now support delayed collection and execution intervals.
New Optimizations
- Data ingestion tasks can now be sorted by start/stop status.
- Adapted for localized tech ecosystem databases, supporting Dameng Database, OceanBase, TDSQL, and Miradb.
【APM】
New Features
- Added MBean configuration.
- Restored the technical components functionality.
- Kubernetes deployment now includes deployment mode selection.
New Optimizations
- API optimizations.
- Call chain Call Map now supports display according to the call sequence.
- Adjusted the description of the CPU usage rate metric.
- Adjusted English version translations.
【ALERT】
New Features
- Developed a new API version and addressed other technical requirements, including support for BPI open interfaces.
- XiaoRui Assistant packaged as a component and deeply integrated into the alert PromQL generation function, providing AI-assisted statement writing.
- Added commonly used overseas notification methods: sending notifications via webhook using Telegram Bot.
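On the Telegram side, a bot sends a message through the Bot API `sendMessage` method, which takes a `chat_id` and `text`. The sketch below builds such a request with the standard library without sending it. How the platform's webhook configuration maps onto this request is not described in these notes, so the function name and the placeholder token and chat ID are assumptions.

```python
import json
import urllib.request


def build_telegram_request(bot_token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a Telegram Bot API sendMessage request."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder credentials; pass the request to urllib.request.urlopen to send it.
    request = build_telegram_request("<BOT_TOKEN>", "<CHAT_ID>", "ONE alert: CPU usage high")
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect when a notification does not arrive.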
New Optimizations
- Adjusted script management functionality. Considering the significant risks of public scripts, the script management feature has been removed from the public cloud. For private deployments, scripts must be selected via backend configuration.
- Removed the "pod frequent restart" rule from the built-in alert rules.
- Adjusted redirection links for the documentation center; PQL statement documentation now redirects to the new documentation center.
- The limit for the number of entity attributes in problem and alert notification templates has been increased to 10, and the character limit adjusted to not exceed 1000 characters.
- Added hover tooltip explanations for the data center field in the alert list.
- Adjusted AI prediction prompt text to support durations from 1 hour to 1 week in the future.
- Menu names and English translations of menu permission points are now consistent with the functional menus.
- Adjusted English text for alert rules, alert rule template library details, AI detection, and AI prediction.