Create an error log alert system
Anonymous User

Recently, in the design round of an interview, I was asked this question:
You have an application running on fewer than 100 machines. It writes logs at several levels: fatal, error, notify, info, debug, and so on. Each log line has a format similar to: timestamp, machine name, log level, error code (if applicable).
Every error has a defined error code. There are around 500 error codes.
If the fatal/error count exceeds a given threshold K in the last 2 hours, your system should raise an alert.
When you raise an alert, it should report the machine name, the error code, and the count for that error code.

I proposed an approach based on a sliding window: store timestamps and a counter in a map, and raise an alert whenever the counter exceeds K within the window from (current timestamp - 2 hours) to now.
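
Roughly, the sliding-window idea could be sketched as below. This is only a minimal single-process sketch under assumptions that were not part of the question: the count is tracked per (machine, error code) pair, timestamps are Unix seconds arriving in non-decreasing order, and the names `ErrorAlerter` and `record` are made up for illustration.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 2 * 60 * 60          # the 2-hour window from the question
ALERT_LEVELS = {"fatal", "error"}     # only these levels count toward K


class ErrorAlerter:
    """Hypothetical sliding-window counter keyed by (machine, error_code)."""

    def __init__(self, threshold_k, window_seconds=WINDOW_SECONDS):
        self.threshold_k = threshold_k
        self.window_seconds = window_seconds
        # Assumption: one deque of timestamps per (machine, error_code).
        self.events = defaultdict(deque)

    def record(self, timestamp, machine, level, error_code):
        """Ingest one log entry; return an alert dict or None."""
        if level.lower() not in ALERT_LEVELS:
            return None
        window = self.events[(machine, error_code)]
        window.append(timestamp)
        # Evict entries that are 2 hours old or older.
        cutoff = timestamp - self.window_seconds
        while window and window[0] <= cutoff:
            window.popleft()
        if len(window) > self.threshold_k:
            return {
                "machine": machine,
                "error_code": error_code,
                "count": len(window),
            }
        return None


alerter = ErrorAlerter(threshold_k=2)
alerter.record(0, "m1", "error", "E42")
alerter.record(20, "m1", "fatal", "E42")
print(alerter.record(30, "m1", "error", "E42"))
```

With fewer than 100 machines and around 500 codes there are at most about 50,000 keys, so the map itself stays small. The per-key deques grow with error volume, though, so one option would be to keep per-minute bucket counts instead of raw timestamps, at the cost of up to a minute of imprecision at the window edge.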

Any pointers about how this can be solved?
