Design analytics system

Design an analytics system.

  1. The input to the system is fed from another service and contains Personally Identifiable Information (PII) such as email, name, etc.
  2. The input comes in the form of an API request, e.g.
    { "email": "abc@gmail.com",
    "phone": 9888,
    "name": "John"
    }
    The service should return the following metrics for the last 1 day, 1 week, 1 month and 2 years:
    A) Number of requests with the given email ID.
    B) Number of requests with unique names for a given email ID. Some requests may contain the same email ID but many different names and phone numbers. That pattern is likely to come from a fraudster. These metrics help us in fraud detection.
    C) Percentage of requests with a given name out of the total number of entries for the given email ID.
    E.g., for email ID abc@gmail.com, there may be a total of 100 records, of which 50 come with the name John, 20 with the name Robert and 30 with the name Daniel. So John is 50%, Robert is 20% and Daniel is 30%.
    Note: You can assume that data older than 2 years is automatically deleted from our datastore.
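To make the three metrics concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption for illustration: the `Request` record, the `WINDOWS` table (with "1 month" and "2 years" approximated as 30 and 730 days), the `metrics_for_email` function, and the reading of metric B as "number of distinct names seen for the email". A real system would back this with a datastore rather than a list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical window table; month and year lengths are approximations.
WINDOWS = {
    "1d": timedelta(days=1),
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
    "2y": timedelta(days=730),
}


@dataclass(frozen=True)
class Request:
    """One incoming API request (hypothetical record shape)."""
    email: str
    name: str
    phone: str
    received_at: datetime


def metrics_for_email(requests, email, now, window):
    """Compute metrics A, B and C for one email over one time window."""
    cutoff = now - window
    # Count requests per name, restricted to this email and window.
    per_name = Counter(
        r.name
        for r in requests
        if r.email == email and r.received_at >= cutoff
    )
    total = sum(per_name.values())
    return {
        "request_count": total,            # metric A
        "unique_names": len(per_name),     # metric B (assumed interpretation)
        "name_percentages": {              # metric C
            name: 100.0 * count / total for name, count in per_name.items()
        },
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 31, 12, 0, 0)
    sample = [Request("abc@gmail.com", "John", "9888", now - timedelta(hours=2))]
    for label, window in WINDOWS.items():
        print(label, metrics_for_email(sample, "abc@gmail.com", now, window))
```

Scanning every record on each call is only workable for a toy example; it is the shape of the computation, not a storage design.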
    **Questions:**
    1. What are the services? Do you need just one service, an analytics service?
    2. Time series DB?
    3. Can Oracle be used? NoSQL?
    4. Can Kafka fit in somewhere?
    5. Performing aggregation in the DB vs. in code.
      E.g., for the number of requests with unique names for a given email ID, the query filters by email ID and then aggregates on the name (count(*)). What are the pros and cons of doing the aggregation in the query vs. offloading it to the application code?
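To ground the last question, the sketch below runs the same per-name count both ways against an in-memory SQLite database. SQLite, the `requests` table, its index, and the helper names `make_db`, `name_counts_in_db` and `name_counts_in_app` are all stand-ins chosen for illustration, not a recommendation for the real datastore.

```python
import sqlite3


def make_db(rows):
    """Create a throwaway in-memory table of (email, name, phone) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE requests (email TEXT, name TEXT, phone TEXT)")
    # Index on email so the filter step does not scan the whole table.
    conn.execute("CREATE INDEX idx_requests_email ON requests (email)")
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)
    return conn


def name_counts_in_db(conn, email):
    """Aggregate in the query: the DB returns one row per distinct name."""
    cur = conn.execute(
        "SELECT name, COUNT(*) FROM requests WHERE email = ? GROUP BY name",
        (email,),
    )
    return dict(cur.fetchall())


def name_counts_in_app(conn, email):
    """Aggregate in code: the DB returns every matching row, we count here."""
    counts = {}
    for (name,) in conn.execute(
        "SELECT name FROM requests WHERE email = ?", (email,)
    ):
        counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    conn = make_db([
        ("abc@gmail.com", "John", "9888"),
        ("abc@gmail.com", "John", "9888"),
        ("abc@gmail.com", "Robert", "1234"),
    ])
    print(name_counts_in_db(conn, "abc@gmail.com"))
    print(name_counts_in_app(conn, "abc@gmail.com"))
```

Both functions return the same mapping; the number of distinct names is its length. The query-side version transfers one row per distinct name and lets the database use its index and grouping machinery, at the cost of CPU on the database. The application-side version moves every matching row over the wire but keeps the database work to a simple filtered read and leaves the logic easy to change and test in code.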