Confluent | Onsite | System Design
Anonymous User

We use a distributed worker platform to asynchronously process tasks (“units of work”) outside of the scope of a user request. A starting point for such a system could be the following:

The application/clients generate a “message” (payload) that is enqueued on a distributed queue. The message describes the request, and the servers in the platform know how to process it. For example, an app could send a “ComputeVector” message with the payload {"object_id": 1}; the server would know how to handle this message type, compute the vector, and save it to the database.
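To make the message flow concrete, here is a minimal in-process sketch of how a server might map a message type to a handler. The envelope shape (`type` plus `payload`), the `HANDLERS` registry, the `handler` decorator, and the placeholder vector are all assumptions for illustration; the question does not specify a wire format or how vectors are computed.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Message:
    """Hypothetical envelope: a type name plus a JSON-serializable payload."""
    type: str
    payload: dict


# Registry mapping a message type to the function that processes it.
HANDLERS: Dict[str, Callable[[dict], dict]] = {}


def handler(message_type: str):
    """Decorator that registers a function as the handler for a message type."""
    def register(fn):
        HANDLERS[message_type] = fn
        return fn
    return register


@handler("ComputeVector")
def compute_vector(payload: dict) -> dict:
    # Placeholder: a real worker would compute the vector and save it to the database.
    return {"object_id": payload["object_id"], "vector": [0.0, 0.0, 0.0]}


def dispatch(raw: str) -> dict:
    """Decode a queued message and route it to its registered handler."""
    msg = Message(**json.loads(raw))
    if msg.type not in HANDLERS:
        raise ValueError(f"no handler registered for message type {msg.type!r}")
    return HANDLERS[msg.type](msg.payload)


raw = json.dumps({"type": "ComputeVector", "payload": {"object_id": 1}})
result = dispatch(raw)
```

Keeping the type name in the envelope means new use cases only need a new registered handler, not a new queue or a change to the dispatch path.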

Initially, we used this platform to “index” documents into our search index. As the platform grew, it enabled many use cases that required async processing: generating ML vectors, syncing data from external systems, generating analytics reports, email processing, etc. This brought new considerations, including the following:

  • The system needs to handle requests of different priorities: higher-priority requests should be served faster
  • As the system grows, debuggability is of paramount importance
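One possible starting point for both considerations, sketched in-process with Python's `heapq` as a stand-in for a real distributed queue. The priority levels, the `Task` fields, the `task_id`, and the `events` log are assumptions for illustration, not part of the original question.

```python
import heapq
import itertools
import uuid
from dataclasses import dataclass, field

# Hypothetical priority levels: a lower number is served first.
HIGH, NORMAL, LOW = 0, 1, 2


@dataclass
class Task:
    type: str
    payload: dict
    priority: int = NORMAL
    # A unique id attached at creation so a task can be traced across components.
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class PriorityTaskQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority
        self.events = []               # (task_id, state) pairs for debugging

    def enqueue(self, task: Task) -> None:
        heapq.heappush(self._heap, (task.priority, next(self._seq), task))
        self.events.append((task.task_id, "enqueued"))

    def dequeue(self) -> Task:
        _, _, task = heapq.heappop(self._heap)
        self.events.append((task.task_id, "dequeued"))
        return task


queue = PriorityTaskQueue()
queue.enqueue(Task("GenerateReport", {"report_id": 7}, priority=LOW))
queue.enqueue(Task("IndexDocument", {"object_id": 1}, priority=NORMAL))
queue.enqueue(Task("ComputeVector", {"object_id": 2}, priority=HIGH))
queue.enqueue(Task("ComputeVector", {"object_id": 3}, priority=HIGH))

served = [queue.dequeue() for _ in range(4)]
order = [(t.type, t.payload) for t in served]
```

Strict priority ordering like this can starve low-priority tasks under sustained high-priority load, so a real design would likely need separate queues per priority with weighted consumption, or some form of aging. The per-task id and state transitions would go to a log or tracing system rather than an in-memory list.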

Given this, how would you design a system that scales and addresses these considerations?

I got a reject. I'm not sure whether the system design round was the reason, but I think it was. Does anyone have a good, scalable solution to discuss? Thanks!
