Atlassian System design: Tagging system

I had a system design interview at Atlassian; the target level was P50.

The initial task was a Tagging System (in my own words): multiple services want to store tags (Confluence for pages, Jira for tickets, etc.) with CRUD operations plus features such as listing all pages per tag, dashboards, and so on.

The interviewer started by asking about endpoint design and made it clear during the interview that he was not interested in the whole system design. The main topics were API endpoints and pagination, with the remaining time (not much) spent on databases.

The question that took the most time was about the PUT endpoint that updates the tags for a page or ticket (contentId):

PUT /content/{contentId}/tags
{tags:[]}

The interviewer asked me to consider a huge number of tags: hundreds of thousands and more. I replied that this doesn't make much sense and that it would be better to have a restriction such as a maximum number of tags per content item, but the interviewer insisted on considering a huge number of tags. What do you think is the proper solution for this case?

My thoughts (happy to hear feedback):
a. Using a simple PUT method:

  • Start with one million tags, each 10 bytes in size, for a total of around 10 megabytes. The payload could also be sent as an attached file.
  • Apply gzip compression. Assuming it achieves a 95% reduction (the real ratio depends on how repetitive the tags are), this results in a manageable 500 kilobytes, which is not too large for a single network request.
  • Upon receiving the compressed data, the server can split it into chunks and write them to the data stores asynchronously using messaging queues. This approach also leverages bulk update capabilities if they are available and necessary.
    As a viable alternative, for large files containing new data, consider uploading them to a shared storage system like S3 and then sending a reference to the file in the HTTP request.
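
To make approach (a) concrete, here is a minimal sketch of what I had in mind, not something shown in the interview. The function names (compress_tags, decompress_tags, chunked) and the JSON body shape are my own assumptions; the actual compression ratio depends entirely on the tag data, so measure it rather than trusting the 95% figure.

```python
import gzip
import json
from typing import Iterator, List


def compress_tags(tags: List[str]) -> bytes:
    """Client side: serialize the tag list as JSON and gzip it for transport."""
    payload = json.dumps({"tags": tags}).encode("utf-8")
    return gzip.compress(payload)


def decompress_tags(body: bytes) -> List[str]:
    """Server side: inverse of compress_tags."""
    return json.loads(gzip.decompress(body).decode("utf-8"))["tags"]


def chunked(tags: List[str], size: int) -> Iterator[List[str]]:
    """Yield fixed-size slices, e.g. one per queue message or bulk write."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(tags), size):
        yield tags[start:start + size]


if __name__ == "__main__":
    # Hypothetical data set: one million 10-character tags.
    tags = [f"tag-{i:06d}" for i in range(1_000_000)]
    body = compress_tags(tags)
    print("compressed bytes:", len(body))
    # The server would decompress once, then enqueue one message per chunk.
    print("chunks of 10,000:", sum(1 for _ in chunked(decompress_tags(body), 10_000)))
```

The client would send the compressed body with a Content-Encoding: gzip header; the server answers quickly (for example with 202 Accepted) and lets queue consumers apply the chunks.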

b. Another approach that was discussed leans towards the candidate's expectations and aligns more naturally with handling large data – the idea of splitting data into manageable batches or chunks. The primary concept here involves breaking down substantial data on the client side and then transmitting it to the server. Afterward, the data can be reassembled on a storage system like S3 and processed piece by piece asynchronously using messaging queues, taking advantage of bulk update capabilities if needed.

  • It's crucial to note that batch and bulk processing don't fit neatly with the core principles of REST. It's often more efficient to send a single request with the file, as mentioned in (a) above. One of the primary reasons is the verbosity of HTTP (each request carries metadata such as headers), which leads to considering alternatives such as streaming over websockets, which have less per-message overhead.
  • For a scalable and reliable way to upload large files split into batches or chunks, there is an open protocol for resumable uploads called tus (https://tus.io/). It has implementations for various platforms, including Java, iOS, Go, .NET, and more. It would be prudent to explore this protocol before attempting to create a custom solution for batch uploading over HTTP.
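
For approach (b), a rough in-memory sketch of the client-side split and server-side reassembly could look like the following. This is not the tus protocol; UploadSession, put_part, and split_into_parts are hypothetical names I made up to show the idea that each part carries an index, so retries are safe and the server can report which parts are still missing.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def split_into_parts(tags: List[str], part_size: int) -> List[List[str]]:
    """Client side: break the full tag list into ordered parts."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [tags[i:i + part_size] for i in range(0, len(tags), part_size)]


@dataclass
class UploadSession:
    """Hypothetical server-side state for one multi-part tag upload."""
    content_id: str
    total_parts: int
    parts: Dict[int, List[str]] = field(default_factory=dict)

    def put_part(self, index: int, tags: List[str]) -> None:
        if not 0 <= index < self.total_parts:
            raise ValueError(f"part index {index} out of range")
        # Re-sending the same index overwrites it, so client retries are safe.
        self.parts[index] = list(tags)

    def missing_parts(self) -> List[int]:
        """Lets a client resume by asking which parts never arrived."""
        return [i for i in range(self.total_parts) if i not in self.parts]

    def complete(self) -> List[str]:
        """Reassemble in order; only valid once every part has arrived."""
        missing = self.missing_parts()
        if missing:
            raise RuntimeError(f"missing parts: {missing}")
        return [t for i in range(self.total_parts) for t in self.parts[i]]
```

In a real system the parts would land in something like S3 and complete() would instead enqueue the asynchronous processing job.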

Remaining part of the interview:
It was a bit confusing, because the requirement above about hundreds of thousands of tags should have carried over into the other parts. We still had some good discussions: I gave a detailed explanation of pagination, covering both the offset and cursor options with trade-offs and examples. This discussion included querying pages by tag and considerations for the dashboard.
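
A small sketch of the offset vs. cursor trade-off, using an in-memory sorted list of page IDs as a stand-in for the "pages per tag" query. The function names are my own; in a database the cursor version would correspond to a WHERE id > :after ORDER BY id LIMIT :limit style query.

```python
import bisect
from typing import List, Optional, Tuple


def page_by_offset(sorted_ids: List[int], offset: int, limit: int) -> List[int]:
    """Offset pagination: simple, but positions shift when rows are inserted."""
    return sorted_ids[offset:offset + limit]


def page_by_cursor(
    sorted_ids: List[int], after: Optional[int], limit: int
) -> Tuple[List[int], Optional[int]]:
    """Cursor (keyset) pagination: resume strictly after the last seen ID."""
    start = 0 if after is None else bisect.bisect_right(sorted_ids, after)
    page = sorted_ids[start:start + limit]
    has_more = start + limit < len(sorted_ids)
    next_cursor = page[-1] if page and has_more else None
    return page, next_cursor
```

If a new row is inserted before the current position between two requests, the offset version repeats an item on the next page, while the cursor version continues from the last ID the client actually saw.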
I shared my insights on the database structure, involving the segmentation of tags, statistical pages per tag, and a fast tag-per-pages mechanism. These recommendations were accompanied by strategies like caching, sharding, consistent hashing, and replication, all of which I justified in response to the interviewer's inquiries.
Additionally, we explored non-functional requirements, touching on critical aspects such as performance, availability, scalability, and durability. These considerations were central to our conversation. Furthermore, we addressed topics related to logging and monitoring during our discussion.
The interviewer set small traps, such as insisting that POST and PUT can be used interchangeably. I did not agree with this and argued that PUT should be idempotent, among other differences.
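
To illustrate the idempotence point, here is a toy in-memory model (my own TagStore class, not any real API): PUT replaces the whole tag set for a content item, so repeating it changes nothing, while POST creates a new record on every call.

```python
import itertools
from typing import Dict, List, Tuple


class TagStore:
    """Toy model contrasting PUT (replace) with POST (create)."""

    def __init__(self) -> None:
        self.tag_sets: Dict[str, List[str]] = {}       # state written by PUT
        self.records: List[Tuple[int, str, str]] = []  # state written by POST
        self._ids = itertools.count(1)

    def put_tags(self, content_id: str, tags: List[str]) -> None:
        # PUT /content/{contentId}/tags: full replacement, safe to retry.
        self.tag_sets[content_id] = sorted(set(tags))

    def post_tag(self, content_id: str, tag: str) -> int:
        # POST /content/{contentId}/tags: each call creates a new record.
        record_id = next(self._ids)
        self.records.append((record_id, content_id, tag))
        return record_id
```

Retrying put_tags after a timeout leaves the store in the same state, whereas retrying post_tag without some deduplication key leaves two records behind.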
