Meesho was born against the odds. We’re making our way in an industry that was largely written off before we showed up 😎
Strong engineering drives this course. Our success depends heavily on building outstanding software at scale, in the most sustainable way.
After all, we cater to a market where others have little penetration or understanding: Bharat.
Scale and sustainability go hand-in-glove at Meesho 🧤
These guidelines are observations and learnings from our journey. We started with a monolith, hacked our way to ship fast, then rewrote and re-architected most of our systems to handle current and future scale.
We use the best tool for the problem at hand. We don't believe in chasing the newest shiny toy in the world of tech. Such fads come around every 3 months, and sitting them out is a collective choice that reflects how we look at tech tooling.
All tech has to solve a problem statement. It’s easy to get swept up in the euphoria around new technologies. We start from problem statements rather than from solutions in search of a problem.
It’s also important to note: these are guidelines. They’re not sacrosanct. Instead, they help us do better, strive for excellence 🙌
Our world hinges on constant change and learning 🙇
These guidelines give us direction, but are not set in stone. Being amenable to change makes us more nimble, in business and in tech.
In defining our core, we have deliberately avoided the word 'principles', simply because of how fast tech evolves. This is true of Meesho as an organisation as well. When we have 10x ambitions in every vertical, we can only frame guidelines, not absolute truths 🙏
Guidelines 📜
Strict separation between code and config 🔒
An app’s config is everything that’s likely to vary between deployments (staging, production, developer environments, etc). Examples: credentials, thread count. For security reasons, we do not commit config to the same repository as code.
A good test for whether an app has all config correctly factored out of the code is whether the codebase could be made open source at any moment, without compromising any credentials.
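As a minimal sketch of this separation, the class below reads the two example values above (a credential and a thread count) from the environment instead of from source. The class name AppConfig, the variable names DB_PASSWORD and WORKER_THREADS, and the default of 8 threads are all hypothetical and chosen only for illustration; they are not Meesho's actual configuration.

```java
import java.util.Map;

/** Hypothetical config holder: every value is supplied at deploy time, never committed with the code. */
final class AppConfig {
    final String dbPassword;
    final int workerThreads;

    private AppConfig(String dbPassword, int workerThreads) {
        this.dbPassword = dbPassword;
        this.workerThreads = workerThreads;
    }

    /** Builds config from a key/value map; a real deployment would pass System.getenv(). */
    static AppConfig fromEnv(Map<String, String> env) {
        String password = env.get("DB_PASSWORD"); // assumed variable name
        if (password == null || password.isEmpty()) {
            // Fail fast at startup rather than falling back to a hard-coded credential.
            throw new IllegalStateException("DB_PASSWORD is not set");
        }
        // Non-secret tuning values may carry a default (8 is an arbitrary choice here).
        int threads = Integer.parseInt(env.getOrDefault("WORKER_THREADS", "8"));
        return new AppConfig(password, threads);
    }

    public static void main(String[] args) {
        // Sample values stand in for the real environment so the demo runs anywhere.
        Map<String, String> sampleEnv = Map.of("DB_PASSWORD", "example-only", "WORKER_THREADS", "16");
        AppConfig config = fromEnv(sampleEnv);
        System.out.println("worker threads: " + config.workerThreads);
    }
}
```

Because nothing secret lives in the source file, it passes the open-source test described above: the same code runs unchanged in staging and production, and only the environment differs.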
Backend tech stack 💻
We prefer Java as the language of choice, unless there is a good reason not to use it. On AWS, we use only managed services built on open-source technology, so that our stack stays platform-independent in case we have to migrate someday. For example: HBase instead of DynamoDB, Kafka instead of SQS.
- Spring Boot for new Java microservices
- Spring Data JPA as ORM layer
- Resilience4j for circuit breaking
- Redis for caching
- MySQL for transactional storage
- HBase for time series and column family use cases
- Apache Spark and Presto for big data batch jobs
- Apache Flink for stream processing
- Elasticsearch for search use cases
- Kafka as message queue
- MongoDB for document DB use cases
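To illustrate what the circuit breaker entry in the list above refers to, here is a minimal sketch of the pattern using only the Java standard library. It is not Resilience4j's API and not our production setup: the class SimpleCircuitBreaker, its parameters, and the "cached response" fallback are hypothetical. The idea is that after a number of consecutive failures, calls to a struggling dependency are skipped for a while and a fallback is returned immediately.

```java
import java.util.function.Supplier;

/** Minimal circuit breaker sketch (hypothetical; a library such as Resilience4j offers far more). */
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long calls are short-circuited once open
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Runs the action unless the circuit is open; returns the fallback on failure or when open. */
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        long now = System.currentTimeMillis();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // open: fail fast without touching the dependency
        }
        try {
            T result = action.get(); // closed, or the open window has elapsed (trial call)
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // open (or re-open) the circuit
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < openMillis;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3, 5_000);
        for (int i = 1; i <= 5; i++) {
            String reply = breaker.call(
                () -> { throw new IllegalStateException("downstream unavailable"); },
                () -> "cached response");
            System.out.println("call " + i + ": " + reply + ", open=" + breaker.isOpen());
        }
    }
}
```

This sketch holds a lock while the action runs, which keeps it short but would serialise calls; a real service would rely on the library's implementation instead.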
No release without QA ✔
We are human. Mistakes happen during development.
Checking that a feature works the way it should lets us move fast while maintaining high quality standards. Our QA teams ensure our customers on slower devices and networks get a similar experience to those on broadband networks. We have optimized our APIs to minimize network woes.
We have an independent QA environment to test features out, without corrupting production data.
Architecture review 🏛
For all major features and new microservices, we prepare an architecture document. This gets reviewed with the team.
This is because we believe in sustainable code.
Creating good documentation also means working as a team, and improving clarity in thought. As engineers, we can get consumed by code and lose sight of the importance of working as a team.
The goal here is simple: catch issues upstream and increase the awareness and knowledge of the whole team.
API SLAs ⏳
When we build for 100 million users, every millisecond spent on a server needs to be thought through.
We have set a P95 latency SLA (Service Level Agreement) of 80 ms for our user-facing APIs. This means downstream services have even tighter SLAs, 20-40 ms for the most part.
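To make the P95 figure concrete, here is a small sketch that computes a 95th percentile from latency samples and checks it against a budget. It assumes the nearest-rank definition of a percentile, which is one common convention and not necessarily what our telemetry uses; the class name LatencySla and the sample latencies are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper: checks latency samples against a percentile budget. */
final class LatencySla {
    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True when the P95 of the samples is within the given budget. */
    static boolean meetsSla(long[] samplesMillis, long budgetMillis) {
        return percentile(samplesMillis, 95.0) <= budgetMillis;
    }

    public static void main(String[] args) {
        // Made-up latencies in milliseconds; one slow outlier at 250 ms.
        long[] samples = {12, 18, 25, 31, 40, 44, 52, 60, 71, 79,
                          35, 28, 22, 19, 15, 48, 55, 63, 67, 250};
        System.out.println("P95 = " + percentile(samples, 95.0) + " ms");
        System.out.println("within 80 ms budget: " + meetsSla(samples, 80));
    }
}
```

A P95 target tolerates a small share of slow outliers, like the single 250 ms sample here, while still bounding what 95% of requests experience. The same check with a 20-40 ms budget applies to a downstream service.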
Observability 🧐
All production services need to support our standard telemetry suite: AWS CloudWatch, in-house log analytics, Telegraf, Prometheus and Grafana.
We also have standard metrics that become available with our internal instrumentation libraries. Every team adds monitoring dashboards to check that uptime is as expected, and alerts go through our incident management process for triaging.
With this, each team is empowered to set its own uptime goals, and monitor them 24/7.
Coding guidelines 📝
We follow OWASP coding guidelines to minimise our attack surface area.
Across all languages, writing readable code is rule #1. We can easily write 100 rules, but the truth is, they all boil down to one: write readable code.
This is easy to say, difficult in practice. We should strive to be as simple as possible.
The long and short 🧑‍💻
Everything is done by a team. We have passed the point where an individual superhero can change the course of a system. There are no 10x engineers, only high-performing 10x teams. Team players are our new superheroes. One point worth calling out: we believe in flexible working hours.
If something is repeatable, it should be done by a program. Never a human. This is the genesis of good engineering.
Lastly, stand up and speak your mind. Don’t build something you are not convinced of 🙏
If our guidelines appeal to you, why not consider working with us to build Bharat's preferred e-commerce site? Head to meesho.io to check our openings!