Using Gen AI to automate text analysis for actionable BI insights
2 months
2 experts
Business Intelligence
Business challenge:
The company needed to automate the processing of text data so that the results could be used in its BI system.
Zoolatech approach:
The approach rested on three strategic decisions:
  • Consolidating all data that needs to be analyzed in a data warehouse (DWH)
  • Introducing automated data pre-processing with Google Dataform
  • Implementing data processing functions and patterns with an LLM
Value delivered:
Introducing automated data processing, including LLM-based text processing, significantly reduced manual work, consolidated heterogeneous data sources, and made it possible to apply the same tools to all of them. As a result, the company gained reliable access to more sources of valuable business information.
Business Challenge

Analyzing corporate data often requires examining various texts, such as customer reviews, open-ended survey responses, and third-party product descriptions. Traditionally, this involves complex methods to prepare the text for analysis: translating, removing unnecessary characters and stop words, breaking it into words or phrases, and standardizing the text. After preparation, techniques like frequency analysis, sentiment analysis, classification, summarization, and named entity extraction can be applied.
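
To make the traditional preparation steps concrete, here is a minimal Python sketch of cleaning, tokenizing, stop-word removal, and frequency analysis. It is an illustration only, not the pipeline used in the project: the stop-word list and the sample reviews are invented, and translation and standardization (for example stemming) are left out.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "was", "to", "of", "but", "very"}


def preprocess(text: str) -> list[str]:
    """Lowercase, replace non-letter characters with spaces, tokenize, drop stop words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def word_frequencies(texts: list[str]) -> Counter:
    """Count how often each remaining token appears across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(preprocess(text))
    return counts


# Invented sample data for demonstration.
reviews = [
    "The delivery was very slow, but the support team is great!",
    "Great product. Slow delivery...",
]
freq = word_frequencies(reviews)
```

Even this small sketch depends on a hand-picked stop-word list and language-specific rules, which is part of why each new task tends to need its own tailored algorithm.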

Automating this process is challenging. Each specific task often requires a specialized algorithm, and complex analyses might still need manual processing. Due to these complexities, text data is underutilized in business intelligence (BI), despite its potential to provide valuable insights for decision-making.

Zoolatech Approach

With the introduction of large language models (LLMs) in natural language processing (NLP), Zoolatech has developed a streamlined approach to simplify and automate text analysis for BI.

Consolidation of Data

All input data and processing results are stored in Google BigQuery, which serves as the data warehouse (DWH). BigQuery offers built-in machine learning features and integrates with Vertex AI.

Automation of Data Pipelines

Data processing is automated with Google Dataform, covering both batch runs and manually triggered ones. Dataform allows analysts to write code in SQL and JavaScript, store scripts in a Git repository such as GitHub, and run them manually or automatically. It also supports unit tests, which significantly simplifies the analysts’ workflow.

Implementation of LLM-Powered Data Processing

The Vertex AI platform, fully integrated with BigQuery, enables the use of the Google Gemini model in the BI cycle. This integration allows for advanced data processing functions and patterns, enhancing the efficiency and quality of text analysis.
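
One way BigQuery exposes Vertex AI models to SQL is the ML.GENERATE_TEXT function, called against a remote model. The source does not say which function or settings the project used, so the Python helper below is only a sketch of that pattern: the helper name, the model and table names, the column name, and the parameter values are all assumptions.

```python
def build_generate_text_query(model: str, source_table: str,
                              text_column: str, instruction: str) -> str:
    """Return a BigQuery SQL string that sends one prompt per row to a remote model.

    All identifiers are assumed to be trusted, analyst-supplied values.
    """
    # Escape backslashes and single quotes so the instruction is a valid SQL literal.
    safe_instruction = instruction.replace("\\", "\\\\").replace("'", "\\'")
    return f"""
SELECT
  {text_column},
  ml_generate_text_llm_result AS llm_answer
FROM ML.GENERATE_TEXT(
  MODEL `{model}`,
  (
    SELECT {text_column}, CONCAT('{safe_instruction}', {text_column}) AS prompt
    FROM `{source_table}`
  ),
  STRUCT(0.0 AS temperature, 64 AS max_output_tokens, TRUE AS flatten_json_output)
)
""".strip()


# Hypothetical names, for illustration only.
query = build_generate_text_query(
    model="my_project.bi.gemini_remote",
    source_table="my_project.bi.customer_reviews",
    text_column="review_text",
    instruction="Classify the customer's review as positive or negative: ",
)
```

Because the result is plain SQL, a query like this could be kept in a Dataform script and scheduled alongside the rest of the pipeline.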

By leveraging LLMs and advanced data management platforms, Zoolatech has significantly simplified the process of text analysis, making it more accessible and effective for business intelligence purposes.

Value Delivered

The solution reduced the processing time for certain types of text data by up to 80%.

As a result, text data can be analyzed and displayed alongside traditional numerical data in Looker Studio. In particular, examples were developed that cover the following types of diagrams and analysis:

Frequency diagrams based on categorization

An example is a histogram of classified text responses. The LLM was first tasked with identifying common categories across the full set of responses. Then, once the final list of categories was compiled, the LLM classified each response according to that list.
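
The two-pass categorization described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the project's code: the `LLM` callable type, the prompts, the "Other" fallback, and the `fake_llm` stand-in (used so the sketch runs offline) are all hypothetical.

```python
from collections import Counter
from typing import Callable

LLM = Callable[[str], str]  # hypothetical: takes a prompt, returns the model's text


def discover_categories(responses: list[str], llm: LLM) -> list[str]:
    """Pass 1: ask the model to propose common categories, one per line."""
    prompt = ("List the common categories in these responses, one per line:\n"
              + "\n".join(f"- {r}" for r in responses))
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]


def classify(responses: list[str], categories: list[str], llm: LLM) -> Counter:
    """Pass 2: assign each response to one category from the fixed list."""
    counts = Counter({c: 0 for c in categories})
    for r in responses:
        prompt = (f"Choose one category from {categories} for this response. "
                  f"Answer with the category only.\nResponse: {r}")
        answer = llm(prompt).strip()
        # Guard against answers that fall outside the agreed list.
        counts[answer if answer in categories else "Other"] += 1
    return counts


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    if prompt.startswith("List the common categories"):
        return "Price\nDelivery\n"
    return "Price" if "expensive" in prompt.lower() else "Delivery"


# Invented sample data for demonstration.
responses = ["Too expensive for what it does", "Arrived two weeks late", "Expensive shipping"]
categories = discover_categories(responses, fake_llm)
histogram = classify(responses, categories, fake_llm)
```

Keeping the two passes separate leaves room to review or edit the category list before classification, and the resulting counts can feed a histogram directly.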

Annotating text

This method is used to create generalized recommendations and reviews.

Summarization and anonymization of responses

When each answer must be displayed as well as analyzed, summarization helps readers understand the answers and suggestions in more detail. Anonymity can be preserved by removing identifying information and the distinctive style of the message.

Tag clouds

Tag clouds are an alternative to frequency analysis when responses cannot be clearly categorized but trends still need to be visualized. To build one, the LLM is tasked with reducing each answer to one or two words.
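
As a rough Python sketch of the tag-cloud step, the function below asks a model for a short tag per answer, normalizes it, and counts how often each tag occurs. The prompt wording, the normalization rule, and the `fake_llm` stand-in are assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Callable


def tag_weights(answers: list[str], llm: Callable[[str], str]) -> Counter:
    """Reduce each answer to a one- or two-word tag and count tag frequency."""
    weights = Counter()
    for answer in answers:
        raw = llm(f"Reduce this answer to one or two words: {answer}")
        # Normalize case and punctuation so 'Slow delivery.' and 'slow delivery' merge.
        words = re.findall(r"[a-z0-9]+", raw.lower())
        tag = " ".join(words[:2])  # enforce the two-word limit
        if tag:
            weights[tag] += 1
    return weights


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call."""
    text = prompt.lower()
    if "late" in text or "slow" in text:
        return "Slow delivery."
    return "pricing"


# Invented sample data for demonstration.
answers = ["It came late", "Shipping is slow", "Costs too much"]
cloud = tag_weights(answers, fake_llm)
```

The counts can serve as the weights that determine each tag's size in the cloud.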

Grouped responses based on sentiment analysis

Grouping responses by sentiment makes it possible to analyze the number of respondents in each group, or even to apply different analysis methods to each group.
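
A minimal Python sketch of sentiment grouping is shown below. The three labels, the prompt, the "unknown" fallback, and the `fake_llm` stand-in are assumptions, not details taken from the project.

```python
from collections import defaultdict
from typing import Callable

SENTIMENTS = ("positive", "neutral", "negative")  # assumed label set


def group_by_sentiment(responses: list[str],
                       llm: Callable[[str], str]) -> dict[str, list[str]]:
    """Ask the model for a sentiment label and bucket each response under it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for r in responses:
        label = llm(f"Label the sentiment as positive, neutral, or negative: {r}")
        label = label.strip().lower()
        # Anything outside the expected labels goes to a separate bucket.
        groups[label if label in SENTIMENTS else "unknown"].append(r)
    return dict(groups)


def fake_llm(prompt: str) -> str:
    """Offline stand-in: inspects only the response part of the prompt."""
    text = prompt.split(": ", 1)[1].lower()
    if "love" in text:
        return "Positive"
    if "broken" in text:
        return "negative"
    return "neutral"


# Invented sample data for demonstration.
responses = ["I love the new design", "The app is broken again", "It works"]
groups = group_by_sentiment(responses, fake_llm)
sizes = {label: len(items) for label, items in groups.items()}
```

Here `sizes` gives the number of respondents per group, and each list in `groups` can be passed to a different follow-up analysis.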
