Gridsum Big Data Platform
Our software solutions are built on the Gridsum Big Data Platform, our proprietary technology designed to acquire, store, process and analyze large and rapidly growing volumes of both structured and unstructured data, leveraging our highly scalable correlation analysis engine and complex event processing capability. Analyzing the rapidly growing volume and complexity of digital data in real time requires a distributed computing architecture. We were among the first digital intelligence companies in China to build solutions entirely on a distributed data warehouse architecture using the open-source Hadoop framework. This architecture allows us to perform multi-dimensional correlation analysis in real time on datasets regardless of size, and to implement real-time interactive data mining at large scale.
The core capabilities of the Gridsum Big Data Platform include:
Multi-dimensional correlation analysis: Our correlation analysis engine enables us to dynamically correlate large quantities of structured and unstructured data on an unlimited number of dimensions. This high-performance capability, enabled by our large-scale distributed data warehouse architecture, allows us to run multi-dimensional data drill-down and data correlation analysis in real time, on datasets regardless of size. We perform this analysis on all of the data in the dataset, without resorting to sampling. Our correlation analysis capability is further enhanced by the size and quality of our datasets, which include data acquired from consenting customers and third parties, public information that we have collected through web crawling, and data derived from these datasets. Our data also includes the correlations that we have retained from our past projects. Because we have been accumulating our datasets since 2009, with a focus on data closely related to our customers’ business operations and customer interactions, we believe that our data assets are among China’s largest and highest quality.
Machine learning capability powered by highly relevant, large-scale historical data assets and industry experience: Our machine learning algorithms learn from experience, identify patterns of interest and make data-driven predictions within their defined parameters. We leverage our data assets and industry expertise to design machine learning algorithms that solve specific industry problems for our customers. In conjunction with our natural language processing technology, machine learning is particularly well suited to processing unstructured data, because it recognizes the patterns and connections through which raw data can be structured and analyzed.
Natural language processing technologies: Natural language processing, or NLP, for the Chinese market is extremely complex due to fundamental characteristics of the Chinese language, including multiple meanings of the same Chinese characters, contextual association of characters into words and lack of punctuation. We have solved these long-standing problems by developing proprietary NLP technologies based on algorithms and machine learning techniques that are designed to understand and analyze the complexity of the Chinese language and its usage in various contexts. Our NLP technologies enable the extraction of information about entities, correlations, sentiments and emotions from a vast volume and variety of digitized documents, text converted from audio and video streams, and other digital content in targeted industries such as legal and media. With our NLP technologies, we are able to extract structure from unstructured data, so that it can be processed and analyzed effectively.
Real-time complex event processing: Our complex event processing technology tracks all available data about events as they occur and applies sophisticated rules to identify patterns that signify problems, threats and opportunities for our customers. This technology is well suited to real-time analysis of large-scale concurrent streaming data.
All of these capabilities of the Gridsum Big Data Platform are easily extensible and highly scalable, allowing us to enter new industries and serve new customers.
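
For readers unfamiliar with complex event processing, the general idea of applying a rule to events as they occur can be illustrated with a minimal Python sketch. It is not a description of the Gridsum implementation. The Event fields, the ThresholdRule class and the detect helper are hypothetical names introduced only for this illustration, and the sketch assumes that events arrive in timestamp order on a single machine rather than on a distributed stream.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record, used only for this illustration."""
    timestamp: float  # seconds since an arbitrary starting point
    source: str       # e.g. a page or campaign identifier
    kind: str         # e.g. "error" or "click"


class ThresholdRule:
    """Match when `kind` occurs at least `threshold` times for one source
    within a sliding window of `window_seconds`."""

    def __init__(self, kind, threshold, window_seconds):
        self.kind = kind
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._recent = {}  # source -> deque of recent timestamps

    def observe(self, event):
        """Feed one event; return True if the rule matches on this event."""
        if event.kind != self.kind:
            return False
        times = self._recent.setdefault(event.source, deque())
        times.append(event.timestamp)
        # Discard timestamps that have fallen out of the sliding window.
        while times and event.timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


def detect(events, rule):
    """Return the events on which the rule matched (assumes time order)."""
    return [event for event in events if rule.observe(event)]
```

In this sketch the rule keeps only the timestamps inside the current window for each source, so memory use is bounded by the event rate rather than by the total length of the stream. A production system would typically also need to handle out-of-order events and partition state across machines, which this illustration leaves out.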