I wasn't able to solve this during the interview. How would you solve it?
At TikTok, we often have people label data for ML models to use. In order to ensure correctness of the labelled data, we test a person’s accuracy.
To do this, we want to introduce errors in data that is known to be accurate and then see if the person we are testing is able to catch the error.
We corrupt data by removing a node and its children. Node names are unique among their siblings.
Write a function corrupt() that removes an eligible node uniformly at random and returns the corrupted menu together with a description of the error (metadata) to be used by grade().
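One possible way to sketch corrupt(), not necessarily what the interviewer expected. It assumes that "eligible" means any node other than the root, and that the metadata is a dict with two hypothetical keys, `parent_path` (the names leading from the root down to the removed node's parent) and `removed_name`. Because names are unique among siblings, that path identifies the spot unambiguously. The optional `rng` parameter is also an addition, only there to make testing easier.

```python
import copy
import random


def corrupt(menu, rng=random):
    """Remove one non-root node (and its subtree) chosen uniformly at random.

    Returns (corrupted_menu, metadata). The input menu is left untouched.
    """
    corrupted = copy.deepcopy(menu)

    # Collect one (parent, child_index, path_to_parent) entry per non-root node.
    candidates = []
    stack = [(corrupted, [])]
    while stack:
        node, path = stack.pop()
        for i, child in enumerate(node.get("children", [])):
            candidates.append((node, i, path))
            stack.append((child, path + [child["name"]]))

    if not candidates:
        raise ValueError("menu has no removable nodes")

    # Every non-root node appears exactly once, so this choice is uniform.
    parent, index, path = rng.choice(candidates)
    removed = parent["children"].pop(index)

    # Assumed metadata format (hypothetical key names).
    metadata = {"parent_path": path, "removed_name": removed["name"]}
    return corrupted, metadata


# Small usage example on a toy menu.
toy_menu = {
    "name": "root",
    "children": [
        {"name": "A", "children": [{"name": "B", "children": []}]},
        {"name": "C", "children": []},
    ],
}
corrupted_menu, meta = corrupt(toy_menu)
```

Collecting all candidates first and then picking one keeps the choice uniform over nodes; picking a random child at each level while walking down would bias the choice toward nodes in sparse branches.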
def corrupt(menu):
    pass

Given a person’s response and metadata describing the error we introduced, we evaluate the person’s response.
We only want to check whether they caught or ignored the error we introduced, so it is enough to check that a node with the same name has been reintroduced at the correct spot.
(Optional)
The person can make other changes and not be penalized if they aren’t relevant to our introduced error.
If the person deleted a node that contained our corruption, we don’t penalize them, since that may have been a correct change.
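A minimal sketch of how grade() could apply these rules, under one assumption about the metadata: that it is a dict with the hypothetical keys `parent_path` (names leading from the root down to the parent of the removed node) and `removed_name`. The function walks that path in the person's menu. If any node on the path is gone, the person deleted a node containing the corruption and is not penalized. Otherwise they pass only if a child with the removed name is back under the parent. Any other changes are ignored.

```python
def grade(maybe_fixed_menu, metadata):
    """Return True if the person passes, False if they missed the removed node."""
    node = maybe_fixed_menu
    for name in metadata["parent_path"]:
        match = next(
            (c for c in node.get("children", []) if c["name"] == name), None
        )
        if match is None:
            # A node containing the corruption was deleted: not penalized.
            return True
        node = match

    # Pass only if a node with the same name is back at the correct spot.
    return any(
        c["name"] == metadata["removed_name"] for c in node.get("children", [])
    )


# Usage on a toy menu where "Eggs" was removed from under "Breakfast".
meta = {"parent_path": ["Breakfast"], "removed_name": "Eggs"}
unfixed = {"name": "root", "children": [{"name": "Breakfast", "children": []}]}
fixed = {
    "name": "root",
    "children": [
        {"name": "Breakfast", "children": [{"name": "Eggs", "children": []}]}
    ],
}
parent_deleted = {"name": "root", "children": []}

print(grade(unfixed, meta))         # False
print(grade(fixed, meta))           # True
print(grade(parent_deleted, meta))  # True
```

Only the name is compared, so a reintroduced node with a different price or different children still counts as a catch, which matches the "same name at the correct spot" rule.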
def grade(maybe_fixed_menu, metadata):
    pass

sample_menu = {
    "node_type": "rootHead",
    "name": "root",
    "children": [
        {
            "node_type": "subRoot",
            "name": "Breakfast",
            "children": [
                {
                    "node_type": "item",
                    "name": "Eggs Benedict",
                    "price": 8.75,
                    "children": [
                        {
                            "node_type": "item_extra",
                            "name": "Add-Ons",
                            "children": [
                                {
                                    "node_type": "extra_option",
                                    "name": "Add Bacon",
                                    "price": 1,
                                    "children": [],
                                },
                                {
                                    "node_type": "extra_option",
                                    "name": "Add Cheese",
                                    "price": 1,
                                    "children": [],
                                },
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}