Scalable SaaS Modernization

Success Story

Helping a Global Fashion Retailer Optimize Cloud Costs and Performance

Redesigning a critical Order-Invoice Kafka Producer service to process events about seven times faster while cutting cloud infrastructure costs roughly fourfold.

4X lower

cloud costs with optimized Kafka-based service architecture.

7X faster

event processing (150K events/hour vs 21K before).

Technologies

Kafka

Java

AWS

Kubernetes

Splunk

Terraform

React

PostgreSQL

Spring

TypeScript

Swift

Okta

Google Cloud

Expertise

Event-driven architecture

Microservices

Highload

Cloud

Backend


Have a Similar Problem?

Reduce dependencies and replace legacy systems with highly scalable microservices.

Contact Sales

Increase durability, scalability, and efficiency

The company needed to redesign a critical service to increase its durability, scalability, and efficiency.

Removing dependencies on obsolete systems

The redesign had to remove dependencies on services that were planned for deprecation.

Zero-downtime transition mandate

The legacy service needed to be replaced with a new one with no downtime.

Complex, error-prone legacy design

Initially, the design of the Order-Invoice Kafka Producer (OINK v1) was complex, error-prone, and difficult to debug.

Performance lag and event reliability

OINK v1 experienced heavy processing lag, handling only ~21,000 RTCIM events per hour on 12 pods, which led to backlogs and missed events that were hard to trace.


Architectural mastery

Zoolatech was chosen for its core expertise in event-driven architecture, microservices, and high-load systems, which were essential for redesigning the complex OINK service.

Execution reliability

Our expertise in backend development, cloud platforms (AWS), and data engineering gave us the ability to plan and execute a reliable, zero-downtime transition.

Zoolatech is a senior-heavy engineering firm with Silicon Valley roots and a Miami HQ, specializing in legacy modernization, system re-architecture, and AI deployment to drive long-term, compounding value.

2017

Year Founded

600+

Employees

96%

Client Satisfaction

Phase 1

Analysis and architecture design

We thoroughly analyzed the service architecture and its limitations. We helped our partner to design the new service architecture, establish the development process, and plan and execute the transition to the new version.

Phase 2

Decoupling and compatibility

Zoolatech helped the client decouple OINK from the OMS system and created OINK v2, which was fully compatible with version 1.

Phase 3

Testing and shadowed deployment

We created a new testing service, the “comparator,” which compares the RTCIMs generated by version 1 and version 2. OINK v2 was tested in the production environment in shadow mode, which let us identify and resolve several upstream issues.

Phase 4

Cross-team collaboration

Throughout the project, the Zoolatech team worked closely with the upstream OMS, Payment, and Tax teams and with the downstream ERTM (Enterprise Retail Transaction Management) and Sales Audit departments.

We re-architected a critical Kafka service to raise throughput about sevenfold while cutting cloud costs roughly fourfold, delivering a faster, more reliable system with zero downtime.

Decoupled architecture and compatibility

Zoolatech helped the client decouple OINK from the OMS system and created OINK v2, which was fully compatible with version 1. The acceptance criterion was that downstream systems could not distinguish RTCIM parcels created by the two OINK versions.

Testing and traffic management

We created a reliable method of redirecting production traffic from OINK v1 to OINK v2. This required building a new testing service, the “comparator,” which compares the RTCIMs generated by version 1 and version 2.
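To make the comparator idea concrete, here is a minimal sketch of how the output of two service versions could be diffed. It is an illustration only, not the client's implementation: the `transaction_id` key, the `produced_at` field that is ignored during comparison, and the function names are all assumptions, since the actual RTCIM format is not described here.

```python
import json
from typing import Any


def normalize(rtcim: dict[str, Any], ignored: frozenset[str]) -> str:
    """Serialize a message canonically, dropping top-level fields expected to differ."""
    kept = {k: v for k, v in rtcim.items() if k not in ignored}
    return json.dumps(kept, sort_keys=True)


def compare_outputs(
    v1: list[dict[str, Any]],
    v2: list[dict[str, Any]],
    key: str = "transaction_id",                       # hypothetical correlation key
    ignored: frozenset[str] = frozenset({"produced_at"}),  # hypothetical volatile field
) -> dict[str, list[Any]]:
    """Pair messages from both versions by key and classify each pair."""
    v1_by_key = {m[key]: m for m in v1}
    v2_by_key = {m[key]: m for m in v2}
    report: dict[str, list[Any]] = {
        "matched": [], "mismatched": [], "missing_in_v2": [], "missing_in_v1": [],
    }
    for k, m1 in v1_by_key.items():
        m2 = v2_by_key.get(k)
        if m2 is None:
            report["missing_in_v2"].append(k)
        elif normalize(m1, ignored) == normalize(m2, ignored):
            report["matched"].append(k)
        else:
            report["mismatched"].append(k)
    # Messages that only the new version produced.
    report["missing_in_v1"] = [k for k in v2_by_key if k not in v1_by_key]
    return report
```

In a shadow deployment, a report like this would be empty in every category except `matched` before traffic is switched over.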

Enhanced reliability and DLQ

We built a DLQ (Dead Letter Queue) layer across all OINK v2 service components, using Kafka Connect’s DLQ support, so that no event is lost during processing.
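The dead-letter pattern can be sketched as follows. The property names in the dictionary are the error-handling settings Kafka Connect defines for sink connectors; the topic name and replication factor are hypothetical values, and the `process_with_dlq` function is a simplified in-memory illustration of the pattern rather than the OINK v2 code.

```python
from typing import Any, Callable

# Kafka Connect error-handling properties for a sink connector.
# The topic name and replication factor below are hypothetical.
DLQ_CONNECTOR_CONFIG = {
    "errors.tolerance": "all",  # keep running when a record fails
    "errors.deadletterqueue.topic.name": "oink-v2-dlq",
    "errors.deadletterqueue.topic.replication.factor": "3",
    "errors.deadletterqueue.context.headers.enable": "true",  # attach failure context
}


def process_with_dlq(
    events: list[Any],
    handler: Callable[[Any], Any],
) -> tuple[list[Any], list[dict[str, Any]]]:
    """Apply handler to each event; failed events go to a dead-letter list."""
    results: list[Any] = []
    dead_letters: list[dict[str, Any]] = []
    for event in events:
        try:
            results.append(handler(event))
        except Exception as exc:
            # Keep the original event and the error so it can be traced and replayed.
            dead_letters.append({"event": event, "error": repr(exc)})
    return results, dead_letters
```

Every input event ends up either in the results or in the dead-letter list, which is what makes failed events traceable instead of silently dropped.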

Dependency and scaling optimization

Thanks to loose coupling in the new event-driven architecture, we reduced the dependency on four upstream services to a single incoming stream of events. We also achieved faster processing by matching the number of Kafka topic partitions to the number of pods.
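Why the partition count matters for pod scaling: within a Kafka consumer group, each partition is consumed by exactly one member, so pods beyond the partition count sit idle and an uneven split leaves some pods with more work. The sketch below models an even assignment of a single topic's partitions. The partition counts in the usage note are hypothetical, since the actual topic configuration is not given here.

```python
def partitions_per_pod(partitions: int, pods: int) -> list[int]:
    """Return how many partitions each pod owns under an even assignment."""
    if partitions < 0 or pods <= 0:
        raise ValueError("partitions must be >= 0 and pods must be > 0")
    base, extra = divmod(partitions, pods)
    # The first `extra` pods take one additional partition each.
    return [base + 1 if i < extra else base for i in range(pods)]


def idle_pods(partitions: int, pods: int) -> int:
    """Count pods that receive no partitions and therefore do no work."""
    return sum(1 for owned in partitions_per_pod(partitions, pods) if owned == 0)
```

For example, with a hypothetical 8 partitions and 12 pods, `idle_pods(8, 12)` reports four pods doing nothing, while 12 or 24 partitions give every one of the 12 pods an equal share.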

Cloud cost reduction

We helped our partner significantly reduce cloud costs. Three pods of OINK v2 deliver approximately the same throughput as 12 pods of OINK v1, cutting costs roughly fourfold.

Performance improvement

By matching the number of Kafka topic partitions to the number of pods, we made RTCIM event processing about seven times faster on the same 12 pods: roughly 150,000 events per hour versus about 21,000 before.

Resource optimization

With the same number of pods, OINK v2 uses over four times less memory. Total RAM consumption dropped from 60 GB to just 14.4 GB across 12 pods.
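The headline ratios follow directly from the figures quoted in this case study. The short calculation below reproduces them. The variable names are ours, and treating cost as proportional to pod count is an assumption made for illustration.

```python
# Figures quoted in this case study.
EVENTS_PER_HOUR_V1 = 21_000
EVENTS_PER_HOUR_V2 = 150_000
PODS_V1 = 12
PODS_V2_EQUIVALENT = 3      # v2 pods matching the throughput of 12 v1 pods
RAM_GB_V1 = 60.0            # total across 12 pods
RAM_GB_V2 = 14.4            # total across 12 pods

speedup = EVENTS_PER_HOUR_V2 / EVENTS_PER_HOUR_V1
# Assumption: cost scales linearly with the number of pods.
pod_reduction = PODS_V1 / PODS_V2_EQUIVALENT
memory_reduction = RAM_GB_V1 / RAM_GB_V2
ram_per_pod_v1 = RAM_GB_V1 / PODS_V1
ram_per_pod_v2 = RAM_GB_V2 / PODS_V1

print(f"throughput: {speedup:.1f}x faster")
print(f"pods for equal throughput: {pod_reduction:.0f}x fewer")
print(f"memory: {memory_reduction:.2f}x less "
      f"({ram_per_pod_v1:.1f} GB vs {ram_per_pod_v2:.1f} GB per pod)")
```

This gives roughly 7.1 times the throughput, four times fewer pods for equal throughput, and about 4.2 times less memory, or 5 GB versus 1.2 GB per pod.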

Efficiency and reliability

The partner gained a more efficient and reliable critical service, leading to cost savings, higher customer satisfaction, and improved business performance.

Strategic cleanup

Replacing the legacy service with no downtime and eliminating its dependencies on services planned for deprecation allowed our partner to begin retiring obsolete systems and achieve further infrastructure cost savings.