We have a near-duplicate ("neardup") pipeline at Pinterest that maps every image to a list of up to k near-duplicate images, for example:
near_dups = {
    "A": ["B", "I", "K"],
    "B": ["A", "D"],
    "C": ["E"],
    "D": [],
    "E": [],
    "F": [],
    "G": ["K"],
    "I": [],
    "K": [],
}
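Note that the mapping is not symmetric: "A" lists "K", but "K" lists nothing. The post does not say what should be computed from it, so the following is only a sketch under one assumption: that each listed pair is treated as an undirected "same image" link and the goal is to group images into clusters (connected components). The function name `group_near_dups` is hypothetical and not part of the original question.

```python
from collections import defaultdict

near_dups = {
    "A": ["B", "I", "K"],
    "B": ["A", "D"],
    "C": ["E"],
    "D": [],
    "E": [],
    "F": [],
    "G": ["K"],
    "I": [],
    "K": [],
}


def group_near_dups(mapping):
    """Group images into clusters, ASSUMING near-dup links are undirected.

    This is an illustrative interpretation of the task, not the stated one.
    """
    # Build an undirected adjacency map so that "A" -> "K" also links "K" -> "A".
    adj = defaultdict(set)
    for img, dups in mapping.items():
        adj[img]  # make sure images with no near-dups still appear
        for dup in dups:
            adj[img].add(dup)
            adj[dup].add(img)

    seen = set()
    groups = []
    # Iterate in sorted order so the output is deterministic.
    for start in sorted(adj):
        if start in seen:
            continue
        # Iterative depth-first search collects one connected component.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(component))
    return groups


print(group_near_dups(near_dups))
# [['A', 'B', 'D', 'G', 'I', 'K'], ['C', 'E'], ['F']]
```

Under this assumption "G" ends up in the same cluster as "A" only because both link to "K". Whether that kind of transitive merging is wanted depends on the actual question being asked.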
There are also 3 rapid-fire ML questions. Take your time answering them.