How Lets Transport Achieved Better Productivity and Lower Cost Structures

Nilay Sahu
Jun 23, 2023

The good news is that Lets Transport is seeing strong adoption and a steep rise in its customer base. With that come the good-to-have problems of ensuring greater reliability and efficiency. We wanted to accelerate this transformation without incurring extra cloud spend or a dip in team efficiency.

We realized we needed a robust, DevOps-led engineering platform to support the journey ahead and provide stability along with a great developer experience. Build vs. buy is a common debate in such scenarios. An agile approach and the ability to deliver outcomes at a fast pace differentiate the leaders from the rest. While buying is the quicker and cheaper option, it generally falls short of the ideal. We are happy that in BuildPiper we found the exact fit we needed to set up an efficient engineering platform at Lets Transport.

The problems we wanted to solve:

  • Gain insight into the usage patterns of our services and their underlying resources, and make cloud costs predictable and controllable.
  • Build a truly auto-scalable infrastructure (with a focus on downscaling quickly based on multiple parameters) to accommodate growing traffic without compromising performance.
  • Enable rapid iteration and deployment cycles for faster bug fixes and new feature releases.
  • Free up senior engineers' time and improve overall engineering productivity, so the team can focus on ‘Scaling with efficiency’ initiatives.

BuildPiper got us through many of these challenges and exposed several critical loopholes that made our tech platform vulnerable. Here is how we scaled reliably, improved developer productivity, and kept our cloud bills in check.

Building a Cost-Optimized Scalable Infrastructure

Cloud cost reduction is not a one-time initiative; it has to run as a recurring process. The best way to achieve that is through automation.

14% Cloud Cost Savings with Environments Downtime Scheduler

Shutting down non-production services manually with Jenkins had become a repetitive task, so eliminating this manual process was our top priority.

With BuildPiper’s auto-scheduler, we defined the off-hours during which our environments shut down. This simple step saved us around 14% of our total non-production costs every month. It also freed up budget for new services.
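The scheduling decision behind such a downtime window can be sketched in a few lines. This is a minimal illustration, not BuildPiper's implementation: the off-hours (22:00 to 07:00), the weekend shutdown, and the function names `should_be_running` and `target_replicas` are all assumptions made for the example.

```python
from datetime import datetime, time

# Hypothetical schedule: these hours are illustrative, not our actual values.
OFF_START = time(22, 0)   # environments shut down at 22:00
OFF_END = time(7, 0)      # and come back up at 07:00
WEEKEND_DAYS = {5, 6}     # Saturday, Sunday (datetime.weekday() numbering)


def should_be_running(now: datetime) -> bool:
    """Return True if a non-production environment should be up at `now`."""
    if now.weekday() in WEEKEND_DAYS:
        return False
    t = now.time()
    # The off window wraps past midnight, so it is the union of two ranges.
    in_off_window = t >= OFF_START or t < OFF_END
    return not in_off_window


def target_replicas(now: datetime, normal_replicas: int) -> int:
    """Replica count a scheduler would request: zero during downtime."""
    return normal_replicas if should_be_running(now) else 0
```

A scheduler would evaluate `target_replicas` periodically and scale each non-production service to the returned count, which removes the need for anyone to trigger the shutdown by hand.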

Time-based HPA

To bring down production costs, we identified the time frames of our peak user traffic and scheduled the number of pods to scale up and down accordingly.
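As a rough sketch of the idea, a time-based schedule maps the current time of day to the replica bounds an autoscaler is allowed to use. The peak windows, replica counts, and names below (`ScalingWindow`, `PEAK_WINDOWS`, `replica_bounds`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import Tuple


@dataclass(frozen=True)
class ScalingWindow:
    start: time        # inclusive
    end: time          # exclusive
    min_replicas: int
    max_replicas: int


# Hypothetical peak windows and replica bounds, for illustration only.
PEAK_WINDOWS = [
    ScalingWindow(time(8, 0), time(12, 0), min_replicas=6, max_replicas=12),
    ScalingWindow(time(17, 0), time(21, 0), min_replicas=4, max_replicas=10),
]
OFF_PEAK = (1, 3)  # (min, max) replicas outside every peak window


def replica_bounds(now: time) -> Tuple[int, int]:
    """Return the (min, max) pod count allowed at the given time of day."""
    for window in PEAK_WINDOWS:
        if window.start <= now < window.end:
            return (window.min_replicas, window.max_replicas)
    return OFF_PEAK
```

Within each window, the usual metric-driven autoscaling can still operate; the schedule only moves the floor and ceiling so that capacity is already in place before a known peak and is released soon after it.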

One-click Creation or Teardown of Infrastructure from Days to Hours to Minutes

Creating new infrastructure for pre-production testing used to take around 3–4 developer days (> INR 100,000), so we wanted to explore cutting that down to hours.

If we could simply duplicate an existing environment, the problem would be solved, and BuildPiper made that possible. With a single click, we could spin up multiple environments to test several features in parallel. A central view of these environments let us delete them once testing was complete.

We went from days to hours to minutes. We experienced zero-cost maintenance of non-production environments and achieved a faster time-to-market for new features.

Enable Rapid Iteration And Deployment Cycles

Experienced 40% Faster Build on Virtual Machines than on Cloud Build Services

We wanted the ability to tailor individual steps of our build process. BuildPiper's job templates were highly customizable, and we used the step catalog to define the sequence of our build steps. As a result, our builds ran 40% faster than on other cloud build services.

Swift Production Rollback, Complete Visibility, and Minimal Downtime

Previously, our build steps were coupled with the deployment process, and the build accounted for 90% of the entire deployment process. Rolling back to a previous build was complicated, time-consuming, and costly.

BuildPiper decoupled the build and deploy processes. It supported better version control by tagging builds, which helped us roll back quickly and deploy the right version.

This also boosted developer productivity through faster config changes and quick service restarts on production, and it prevented potential downtime.
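To show why tagged builds make rollback cheap, here is a minimal sketch of the pattern. It is an assumption-laden illustration rather than how BuildPiper works internally: the `Releases` class, its methods, and the tag and artifact names are all hypothetical.

```python
class Releases:
    """Tracks immutable build tags and which one is currently deployed."""

    def __init__(self):
        self.artifacts = {}   # tag -> artifact reference (e.g. an image name)
        self.history = []     # tags in the order they were deployed

    def build(self, tag, artifact):
        """Register a finished build under a unique tag (the slow step)."""
        if tag in self.artifacts:
            raise ValueError(f"tag {tag!r} already exists; tags are immutable")
        self.artifacts[tag] = artifact

    def deploy(self, tag):
        """Deploy an existing build; no rebuild is triggered."""
        if tag not in self.artifacts:
            raise KeyError(f"no build tagged {tag!r}")
        self.history.append(tag)
        return self.artifacts[tag]

    def rollback(self):
        """Redeploy the previously deployed tag from the stored artifact."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        self.history.pop()  # drop the faulty release
        return self.artifacts[self.history[-1]]

    @property
    def current(self):
        return self.history[-1] if self.history else None
```

Because `rollback` only looks up an artifact that already exists, it skips the build entirely, which is the part that dominated deployment when the two steps were coupled.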

Freeing Up Time of Senior Engineers

Ensuring Standardized Deployments along with Savings of 20 Hours per Month

Managing 25 services x 4 environments through Jenkins made “faster time-to-market” a distant dream. Upgrading Jenkins took us nearly 2 developer days. Our biggest challenge was having 5 inconsistent versions of Jenkins deployed across different environments, which required senior engineers' bandwidth for some complex deployments.

Here we reimagined the entire process with the template-based approach of BuildPiper, which also aided in standardizing deployments across all environments. From its UI, we picked a predefined template and customized it with the right job details.

This approach saved us nearly 20 hours of developer toil per month.

Following Best Practices to Prevent Duplicate Tasks

Previously, we had not invested in a proper backup mechanism for our Jenkins job configurations. As a result, we once lost all the development job configurations, and it took us a week to recreate everything.

Things are different now: BuildPiper stores all the configs in the SQL server, which is regularly backed up to GCP.

Peace of Mind

One of the most neglected aspects of our platform was security, and we wouldn’t have realized it if not for BuildPiper. Our Jenkins setup was open to the internet with simple and unrestricted passwords, making it easily accessible over the public network.

We have now established a proper VPN for both production and non-production clusters through BuildPiper. Even with a GCP project service account, it is not easy to access the clusters (and run kubectl commands) without the VPN. With this setup in place, we are far less worried about a black swan event such as the system getting hacked.

Future Plans

We plan to enhance the platform's functionality by streamlining maintenance efforts and improving transparency:

  • Supporting our debugging efforts with integrated testing and extensive code quality checks via SonarQube, for a robust error-handling mechanism.
  • Ensuring extensive visibility into our infrastructure through Prometheus and understanding traffic flow via Istio, so we can set up proper scaling patterns.

There are still many unexplored use cases that we’re keen to experiment with, to understand how we can further streamline our platform engineering.