Google | Phone | Shortest substring not in data
Anonymous User
3283

Given a chunk of data (~4GB) find the shortest substring that does not exist in the data. Due to the size of the data the main challenge is to reduce the size of the search space, i.e. estimate how many bytes the shortest substring has that does not exist in the data. After that's done we need to find an appropriate data structure and implement the solution.

Fo example, given the data acbaaccb and the alphabet {a, b, c}, one shortest substring not in the data would be bb. Now imagine the data being much bigger (~4GB) and the alphabet consisting of any byte between 0 and 255.

I got hung up on the first part for quite a while and thus didn't have any time left to program the solution. The whole problem only became fully clear to me shortly after the interview when I could reason about it quietly on my own, which shows how much difference the time pressure of the interview makes.

Comments (15)