We Automated Attribute Tagging Using Deep Learning Models (Part 1)

Product discovery is a crucial component of an e-commerce system. When a user searches for “Red party wear Kanjeevaram saree”, the Meesho app combs through millions of items and brings up relevant results.

This is possible because every product in the system is tagged with its attributes. When you apply a “red colour kurti” filter on the Meesho app, it returns all the Kurtis that have the ‘red’ colour property tagged to their product IDs.

This means there are two things essential for product discovery -

The Product Image
Product Attribute Fields (e.g. Colour = Blue, Sleeve length = Half sleeves, Pattern = Striped, etc.)

Millions of sellers who upload their products on Meesho need to fill in the relevant attribute fields for each product. But do sellers fill in all the attribute fields, and if they do, do they fill them in accurately? Well, not really. But why is that?

Not tech savvy: Suppliers using Meesho are generally not very tech savvy and are often unfamiliar with the upload process. They sometimes find it difficult to fill in all the fields for different product categories while uploading products on the platform.
Incorrect Taxonomy: Many suppliers come from remote regions of India and often misinterpret what an attribute means. This results in inconsistencies in the information and attributes filled in by sellers.
Large Taxonomy: Some household and ethnic wear categories have a long list of attributes that need to be filled in before products can be uploaded on the platform. Some suppliers don’t have all the necessary information about their products and find it difficult to fill in all these details.

For some fashion categories, the supplier fill rate is as low as 50% (i.e. out of 100 suppliers uploading a product in the Kurtis category, only 50 provide recommended information like sleeve length, pattern type, etc.).
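To make the fill rate concrete, here is a minimal sketch of how it could be computed. It assumes the rate is measured per listing and that an empty field is stored as `None`; the `listings` data and the `fill_rate` helper are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical listings: None marks a field the supplier left empty.
listings = [
    {"sleeve_length": "Half sleeves", "pattern": "Striped"},
    {"sleeve_length": None, "pattern": "Solid"},
    {"sleeve_length": "Full sleeves", "pattern": None},
    {"sleeve_length": None, "pattern": None},
]


def fill_rate(listings, attribute):
    """Share of listings (0.0 to 1.0) where `attribute` has a value."""
    if not listings:
        return 0.0
    filled = sum(1 for item in listings if item.get(attribute))
    return filled / len(listings)


print(fill_rate(listings, "sleeve_length"))  # 2 of 4 listings are filled -> 0.5
```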

What happens if only a small fraction of products on the platform have complete information and the rest do not? Product discovery is hampered. Without the relevant tags, when a user searches for a “Red party wear Kanjeevaram saree”, only a small number of products will show up. This limited selection would lead to a drop in conversion and user retention on the platform.
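The effect of a missing tag can be sketched with a toy attribute filter. The `catalog` records and the `filter_products` function below are hypothetical, and a real search system is far more involved, but the failure mode is the same: a product that was never tagged cannot match the filter.

```python
# Hypothetical catalog: product 3 is a red saree whose colour was never tagged.
catalog = [
    {"id": 1, "category": "Saree", "colour": "Red"},
    {"id": 2, "category": "Saree", "colour": "Blue"},
    {"id": 3, "category": "Saree", "colour": None},
]


def filter_products(catalog, **wanted):
    """Return products whose tagged attributes match every requested value."""
    return [
        product
        for product in catalog
        if all(product.get(name) == value for name, value in wanted.items())
    ]


red_sarees = filter_products(catalog, category="Saree", colour="Red")
print([p["id"] for p in red_sarees])  # [1]; the untagged product 3 is left out
```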

How do we currently solve this problem?

To deal with low attribute fill rate and information inaccuracy for product details, Meesho has built a tagging and quality check process for products uploaded by suppliers.

Tagging is the process of filling in the information fields that the seller did not provide. It enhances the information richness of catalogs/products.

In addition to tagging, several quality checks are made on products before they are ready to go live on the platform. Tagging and quality check operations are currently a manpower-heavy process.

We have a dedicated team of agents who work every day to enhance the attribute fill rate of products and ensure that information is correctly filled in. But this process has significant challenges -

Manpower Training: As the product inflow and the number of product categories on the platform increase, it will become difficult to scale and manage manpower operations. Teams will have to be trained constantly on new categories that are added to the system.
Scale Issues: Every additional listing on the platform comes with an incremental cost, since we have to add manpower for tagging and quality operations for that listing.
Time Intensive: Tagging and quality checks are time intensive, and we need to give the operations team sufficient time to perform them. This leads to higher live times for suppliers (Live Time is the time it takes for a product listed by a supplier to become active on the Meesho app).

As you can see, manual tagging and quality check operations are not scalable, and we need a more intelligent, adaptive and faster process for filling in missing product information.

If product information can be inferred from an image, the image becomes our primary source for attribute extraction. The image gives us high-level insights and patterns about the product, and some of these can be used to fill its attribute fields. For example, attributes such as colour, sleeve length and pattern are usually visible in the product photo.

How did we solve this using AI?

We took up the challenge of consistently predicting attributes from product images with an AI model.

We formulated this as an image classification problem and solved it using state-of-the-art Deep Learning (DL) computer vision models such as Residual Networks (ResNet-50, ResNet-101, etc.), which learn the mapping from a product image to its attributes. We also use various DL-based background removal and object detection algorithms to enhance the process. Think of Google Lens and how it is able to detect and decipher in-image information.
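In a classification setup like this, the network emits one raw score per possible attribute value, and the highest-scoring value becomes the predicted tag. The sketch below shows only that last step in plain Python. The label list, the scores and the `predict_attribute` helper are assumptions made for illustration; they stand in for the output of a trained model and are not taken from the production system.

```python
import math

# Hypothetical label set for one attribute of one category.
SLEEVE_LABELS = ["Sleeveless", "Half sleeves", "Full sleeves"]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_attribute(scores, labels):
    """Pick the most probable label and report its probability."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Stand-in for the scores a trained network would emit for one image.
label, confidence = predict_attribute([0.2, 3.1, 0.5], SLEEVE_LABELS)
print(label, round(confidence, 2))
```

The probability attached to the prediction gives a rough measure of how sure the model is, which can be useful when deciding whether a tag should be trusted.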

To train our DL models for automatic product attribute tagging, we prepared a training dataset for each attribute of each category. The training dataset is a set of ground truth data that a machine learning model uses to learn to make predictions.
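One way to picture "a training dataset for each attribute of each category" is as a set of (image, label) pairs grouped by category and attribute. The sketch below assumes ground truth comes from listings that already carry tags; the record layout, file names and `build_datasets` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical tagged listings that serve as ground truth.
tagged_listings = [
    {"image": "img_001.jpg", "category": "Kurtis",
     "attributes": {"colour": "Red", "pattern": "Striped"}},
    {"image": "img_002.jpg", "category": "Kurtis",
     "attributes": {"colour": "Blue", "pattern": None}},
    {"image": "img_003.jpg", "category": "Sarees",
     "attributes": {"colour": "Red"}},
]


def build_datasets(listings):
    """Group (image, label) pairs by (category, attribute), skipping empty labels."""
    datasets = defaultdict(list)
    for listing in listings:
        for attribute, value in listing["attributes"].items():
            if value is not None:
                datasets[(listing["category"], attribute)].append(
                    (listing["image"], value)
                )
    return dict(datasets)


datasets = build_datasets(tagged_listings)
print(datasets[("Kurtis", "colour")])
```

Each key in `datasets` then corresponds to one model to train, for example one for the colour of Kurtis and another for their pattern.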

We started our training and testing with the Kurtis category since it has a relatively high upload volume and a poor attribute fill rate by sellers. More data means higher confidence, which made it the perfect test category for a proof of concept to validate this approach.

What was the outcome and impact?

We quickly trained our ResNet models and achieved accuracy comparable to, and in some cases better than, that of our agents for the Kurtis category. The model was put into production and its output was consumed in the tagging workflow. This way we saved 1,200 minutes per day of manual tagging and 100% of the cost associated with tagging operations for Kurtis.

How do we scale the automation?

Now imagine the cost and time benefits of scaling these DL models to all categories on the platform. They would be immense, but it’s not an easy task. Why?

There are over three thousand product categories on the platform, which means we would have to build custom models for these categories and their product attribute fields. This can become very tedious and would take months to complete.

How did we solve this at scale? Our in-house AI team came up with an interesting approach for parallelised training and experimentation, which reduced the combined training time by a whopping 90%!

With a parallelised framework for major categories, we were able to reduce taxonomy tagging by 60% and save 46,545 manpower minutes per day! Find out how we built these image science models, how we scaled them and what challenges we faced in the next part of this blog series.
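This post does not describe the framework itself, so the following is only a generic illustration of the idea, not the team's implementation: models for different (category, attribute) pairs are independent of each other, so their training runs can be dispatched side by side. The job list and the placeholder `train_model` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (category, attribute) pairs, one model each.
JOBS = [("Kurtis", "colour"), ("Kurtis", "pattern"), ("Sarees", "colour")]


def train_model(job):
    """Placeholder for a real training run; returns a made-up model name."""
    category, attribute = job
    return f"{category}-{attribute}-model"


def train_all(jobs, max_workers=3):
    """Run independent training jobs concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_model, jobs))


models = train_all(JOBS)
print(models)
```

Real training jobs are compute heavy, so in practice they would more likely be spread across separate processes, GPUs or machines than across threads. The thread pool here only keeps the sketch short.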

This blog was co-authored by Ashwin Srivastava and Arun Patro.

Think you have the chops to work on these exciting problems? Do you have an insatiable itch to discover and solve these frontier challenges? If yes, then hop over to our careers page and apply for a position that fits you the best.