India’s e-commerce market is expected to grow to $120.1 billion over the next five years. As businesses move online, we simplify shipping for millions of sellers so they can focus on growing their business. At Shiprocket, we handle lakhs of shipments daily, with the average seller processing more than a dozen orders per day on our platform.
Sellers usually ship multiple orders at once by uploading a CSV file with details of all their orders. This works well enough, but we saw plenty of room to improve the end-user experience of the process. The overhead it placed on our system was an equally pressing concern for our engineers.
This compelled our engineering team to find a more optimal solution. So we decided to revisit this problem to solve it more intuitively for our users. In this post, I’ll walk you through how and why we used web sockets for processing bulk operations on Shiprocket.
A Primer on Polling
Currently, any bulk operation performed on the Shiprocket India panel uses a technique called Long Polling, or Client-Side Pull. Polling mimics bidirectional communication between the client and the server on a web application. Here’s a quick refresher on how it works.
The client makes a request to the server and the server stalls this request until it’s ready to respond. When the server responds back with data, the client immediately sends a new request, and so on.
This lets you keep users informed about what’s happening under the hood, which creates a more engaging experience. And since the next request is only made once the previous response arrives, you also avoid a lot of unnecessary load on the server.
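To make this concrete, here is a minimal sketch of a client-side long-polling loop. The /status endpoint and the shape of its JSON response (a completed flag and a progress field) are illustrative, not our actual API.

// A minimal long-polling loop: each response immediately triggers the next request
function pollStatus() {
    fetch('/status')
        .then(function (res) { return res.json(); })
        .then(function (data) {
            console.log('Progress so far:', data.progress);
            if (!data.completed) {
                // The server has responded, so fire the next request right away
                pollStatus();
            }
        })
        .catch(function (err) {
            console.error('Polling request failed:', err);
        });
}

pollStatus();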
Polling Did a Decent Job
When a seller uploads multiple orders, our frontend makes an HTTP request to our server to process a batch of orders. Consequently, we display a popup that presents the real-time progress of that process.
As the server starts processing this batch of orders, the frontend makes a request to another endpoint. This request tracks the status of the current batch of orders. When the server receives this request, it establishes a polling connection with the client. Here’s a glimpse of a function that handles this on our frontend:
// Use polling to check the status of the bulk operation
function checkUploadStatus() {
    // Call our service for checking the status
    Service.checkUploadStatus(url)
        .then(function (res) {
            // Response callback from the polling request
            // Handle all types of response from the request
            …
            if (data.updated) {
                // Process is completed, close the popup
                $uibModalInstance.dismiss('cancel');
                // Notify the user that the process is complete
                Notify.alert('Upload Successful!', {status: 'success'});
            }
        })
        …
        .catch(function (error) {
            // Catch exceptions from the called service
        });
}
When the server is done processing the current batch of orders, it sends a response back to the frontend. The frontend then immediately sends a fresh request to process the next batch. This continues until all the orders are processed, and voilà, the seller has conveniently uploaded her bulk orders!
But Not a Perfect One
There were some common pitfalls associated with this approach. The request that fetches the progress of the process updates a record in our database and sends it back to the frontend. The front end uses the response from this API to notify the user of the progress inside the popup. As sellers started using our service more often, we experienced a lot of load on our servers and this entire process became more resource-intensive than we anticipated.
Another issue was the lack of a seamless user experience for the seller. Since we had to explicitly make a client-side request for fetching the progress, if a seller intentionally or accidentally closed the tab and moved to another one, there was no way for her to know the live status of the process anymore. Additionally, until the bulk operation was completed in its entirety, the seller was restricted from performing any other action on the panel.
To combat these problems, we decided to implement a solution based on WebSockets for any kind of bulk operation on the panel. But first, let’s understand what WebSockets are and how they work.
Enter WebSockets
WebSockets create a persistent, bidirectional communication channel between your client and a server. Unlike the repeated HTTP requests of polling, this channel is built directly on top of TCP/IP. As a result, your client no longer has to keep making requests: the server can push messages over the channel whenever it has something to send.
How it Works
There are primarily three events that occur during the course of a WebSocket connection. First, the client asks the server to establish a WebSocket connection; both the client and the server create a WebSocket object for this purpose. This initial setup happens over a single HTTP request-response exchange. If the server agrees, a communication channel is opened between the two.
Then, the client and server interact over this channel without any further HTTP requests. When the server sends data down the channel, the client-side WebSocket object fires a MessageEvent carrying that data.
Finally, when either the client or the server decides to close the WebSocket connection, it initiates a closing handshake that shuts down the communication channel. The TCP connection is torn down and the client-side WebSocket object receives a CloseEvent.
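Here is a minimal sketch of those three stages using the browser’s native WebSocket API; the URL and messages are illustrative.

// 1. Ask the server to establish a WebSocket connection (handshake over HTTP)
var socket = new WebSocket('wss://example.com/updates');

socket.onopen = function () {
    // The handshake succeeded and the channel is open
    socket.send('ready');
};

// 2. The server pushes data; the client-side WebSocket object fires a MessageEvent
socket.onmessage = function (event) {
    console.log('Received:', event.data);
};

// 3. The connection is closed; the client-side WebSocket object receives a CloseEvent
socket.onclose = function (event) {
    console.log('Connection closed with code', event.code);
};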
Applications and Advantages
A single server can maintain multiple WebSocket connections with the same client, as well as separate connections with many different clients at once. This allows you to use WebSockets for creating real-time chat applications or multiplayer games.
The more relevant use case here is uploading large files to the cloud. When you send data through a WebSocket, it’s divided into a number of smaller chunks. You can leverage this to send parts of a large file to the server, and over the same connection the server can send back data indicating the upload’s progress. Since you’re not constantly bombarding your server with HTTP requests, the latency issues typically associated with polling are eliminated and your server is freed from the unnecessary load.
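As a rough sketch of that idea (the endpoint URL and 64 KB chunk size are illustrative choices, not values we use), a large file can be sliced and streamed over a single WebSocket connection:

var CHUNK_SIZE = 64 * 1024; // 64 KB per message

function uploadFile(file) {
    var ws = new WebSocket('wss://example.com/upload');

    ws.onopen = function () {
        // Slice the file and send it to the server piece by piece
        for (var offset = 0; offset < file.size; offset += CHUNK_SIZE) {
            ws.send(file.slice(offset, offset + CHUNK_SIZE));
        }
    };

    ws.onmessage = function (event) {
        // The server can push progress updates back over the same connection
        console.log('Upload progress:', event.data);
    };
}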
Potential Solutions
After we doubled down on WebSockets, we evaluated a few libraries and services for the implementation.
To compare holistically, we prioritized our comparison based on the following factors:
- Alignment with our use-case and tech-stack
- Infrastructure and Scalability
- Ease of use
- Security and Performance
- Cost
- Advanced Features
We compared Socket.io, Ratchet (laravel-websockets), Ably, and Pusher based on the above factors.
Comparative Analysis
With respect to our use case, performance, and ease of use, all four options were more or less equally matched.
Our frontend runs on AngularJS, but our backend uses PHP (Laravel). Socket.io is written in JavaScript, which meant we would need a Node.js server to use it. Similarly, Ratchet (via the laravel-websockets package) is tied specifically to Laravel, and we wanted something more generic. Since Ably and Pusher offer both server-side and client-side implementations suited to our tech stack, they had the upper hand here.
Socket.io and Ratchet are both self-hosted, which means we needed our own servers to host the service. Our focus was more on implementing the solution rapidly and we didn’t want to worry about the infrastructure, scalability, and maintainability of the service.
Considering our large user base and DAUs, we wanted to keep the security of our service intact without heavily investing in our engineering resources. Therefore, choosing a cloud-hosted service from a security standpoint was a no-brainer.
Brownie points to both Pusher and Ably: being cloud-hosted, they did all that heavy lifting for us. This way, we could focus on optimizing the service, making it more UX-friendly, and most importantly, delivering it on time for our sellers.
In terms of cost, Socket.io and Ratchet are open-source libraries, so they are completely free to use. However, by the time we got to the price comparison, we had already ruled them out for the reasons above.
Finally, Pusher, being significantly cheaper than Ably, became the most appropriate choice for us. It also provides real-time stats, a debug console, and push notifications out of the box.
Implementation
We used Pusher’s server-side PHP SDK on the backend and client-side JavaScript SDK on the frontend.
When a seller uploads bulk orders, our frontend creates a bulk request and checks whether a Pusher channel connection is already active for the seller. If not, it creates a new WebSocket connection with the server.
// Check if we need a new Pusher connection
if (!pusher || (pusher.connection && pusher.connection.state === 'disconnected')) {
    // Create a new Pusher connection
    window.pusher = new Pusher(pusherKey, options);
}
Once a connection is established, we bind to the required events on the seller’s channel. Then, we call our backend API to process the bulk operation.
// Binding a Pusher channel event
pusherChannel.bind('bulk_order_upload', function (data) {
    ...
});
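For context, the surrounding flow looks roughly like this; the channel name and the sellerId, orders, and Service.processBulkOrders identifiers are illustrative names rather than our exact implementation.

// Subscribe to a per-seller channel before binding any events
var pusherChannel = pusher.subscribe('bulk-operations-' + sellerId);

// ... bind the events as shown above ...

// Then ask the backend API to start processing the uploaded orders
Service.processBulkOrders(orders);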
At the server end, we keep track of the active processes for every seller. Since the entire process is asynchronous, our server dispatches information such as success, failure, and pending counts to the corresponding events on the Pusher channel.
// Dispatch information from the server
public function broadcastWith(): array
{
    return [
        /*
        Return properties such as message, success, failed, etc.
        for the current bulk operation process to the corresponding
        events in the channel
        */
        ...
    ];
}
Our frontend continuously listens for these events and updates the UI in real time to show the seller the status of the process. For instance, in the case of a bulk upload, we show the number of successful and failed uploads, how many uploads remain, and so on.
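As an example, a listener that keeps the popup in sync might look roughly like the following, assuming the payload carries success, failed, and pending counts (the field names are illustrative):

pusherChannel.bind('bulk_order_upload', function (data) {
    // Wrap in $apply so AngularJS picks up changes made outside its digest cycle
    $scope.$apply(function () {
        $scope.successCount = data.success;
        $scope.failedCount = data.failed;
        $scope.pendingCount = data.pending;

        if (data.pending === 0) {
            // Everything is processed: notify the seller
            Notify.alert('Upload Successful!', {status: 'success'});
        }
    });
});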
Connection Optimizations
Pusher limits the number of concurrent connections it can handle at a time, so we needed to optimize user connections on both the frontend and the backend. Here’s how we did it:
- We keep the connection active only for non-idle users: we initiate a connection only when the seller creates a request.
- We make a logical assumption that no bulk operation process will take more than 2 hours to complete. Thus we set a cap on the maximum duration for which a connection is active.
- If the seller has multiple tabs opened, the currently active tab takes precedence over the others. Therefore we only open the connection for the active tab and close it for any other inactive tabs.
- If the user switches tabs, we wait up to a minute before closing the connection of the now-inactive tab (a sketch of this follows the list).
- When a seller refreshes the page or starts a session from another browser or system, we make a request to a backend API to check whether we need a new connection for this seller.
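As an illustration of the tab-switching rules above, a sketch using the browser’s Page Visibility API might look like this; it assumes the global pusher instance from earlier and is not our exact implementation.

var disconnectTimer = null;

document.addEventListener('visibilitychange', function () {
    if (document.hidden) {
        // The tab went inactive: close the connection after a one-minute grace period
        disconnectTimer = setTimeout(function () {
            pusher.disconnect();
        }, 60 * 1000);
    } else {
        // The tab is active again: cancel the pending disconnect and reconnect if needed
        clearTimeout(disconnectTimer);
        if (pusher.connection.state === 'disconnected') {
            pusher.connect();
        }
    }
});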
Conclusion
We compared the traditional polling approach with the WebSocket approach in terms of server load, latency, and end-user experience.
We managed to create a more delightful and seamless experience for our sellers. Moreover, we reduced our server load substantially: polling accounted for roughly 30K – 100K requests per day on our servers, and WebSockets eliminate all of them.
We avoided any over-engineered complexity in our implementation, and we plan to replace polling with WebSockets across the entire Shiprocket platform in the near future.