Amazon | OA | USA solved!

Anonymous User

4831

Amazon United States Online Assessment Interview

I just had on OA with Amazon. I'm still unable to come up with a timely solution for the second problem, so I'm asking you guys.

Problem 1
Find the maximum number of pages read of all the days in the linked list. The input is the number of pages read in the form of a linked list where the number of pages read at the start of the day is at the front and the number of pages read at the end of the day is added to the tail. So it would look like day3->day2->day1->day1->day2->day3-> The length of the linked list is always even.

For this question you need to use a fast and slow pointer, find the middle, reverse the linked list from the middle to the end, then take the maximum sum from all the sums in both lists.

input:
a->b->c->d->d->c->b->a->
split:
a->b->c->d-> | d->c->b->a->
reverse:
a->b->c->d-> | <-d<-c<-b<-a
answer = max of (a+a, b+b, c+c, d+d)

import random
class Link:
    def __init__(self, data=None, next=None):
        self.data = data
        self.next = next

    def __str__(self):
        node = self
        res = ""
        while node:
            res += f'{node.data}->'
            node = node.next
        return res


def createDataAmazonQ1Data(N):
    N -= 2
    if N & 1:
        N -= 1
    read = random.randint(1, 5)
    head = Link(read)
    node = head
    while N + 1:
        child = Link(random.randint(1, 5))
        node.next = child
        node = child
        N -= 1
    return head


def maximumPagesReadAmazonQ1(pages):
    # print(pages)
    head = pages
    fast, slow = head, head

    # find the middle
    while fast:
        slow = slow.next
        fast = fast.next.next

    # print(f'slow {slow}')
    # print(f'fast {fast}')

    # reverse the links for the second half
    prev = None
    node = slow
    while node:
        next_ = node.next
        node.next = prev
        prev = node
        node = next_

    # print("slow", slow)
    # print("prev", prev)
    # print("node", node)

    # get the sum of data from each part
    left = head
    right = prev
    ans = 0
    while left and right:
        # store the maximum sum
        ans = max(ans, left.data + right.data)
        left = left.next
        right = right.next

    return ans
	
	if __name__ == "__main__":
		print(maximumPagesReadAmazonQ1(createDataAmazonQ1Data(8)))

Something like that.

The second question is much harder.

Find the maximum difference of character frequencies for all substrings in a given string.

s = "aab"
maximum difference is 1 because count(aa) - count(b) = 1

s = "aabbbcacba"
maximum difference is 3

A brute force approach looks like this

from collections import Counter
def maximumFrequencyDifferenceBrute(s):
    ans = 0
    for i in range(len(s)):
        for j in range(i+1, len(s)):
            c = Counter(s[i:j+1])
            c = c.values()
            ans = max(ans, max(c) - min(c))
    return ans

In writing this post I realize that pruning might be the way to do it.

def maxFrequencyDiferencePruning(s):
    ans = 0
    for i in range(len(s)):
        for j in range(i+1+ans+1, len(s)):
            c = Counter(s[i:j+1])
            print(i, j, s[i:j+1], "c", c, end="")
            c = c.values()
            ans = max(ans, max(c) - min(c))
            print(" :: max so far", ans)
    return ans

The inner loop j will only look for substrings that could possibly be larger than the best answer so far. (ans+1)

So the window starts out small and gets more agressive as it finds better answers.

My initial approach was to use intervals to figure out the size of the window and then slide the window along to look for the differences.

I also made an excessive deque structure that would spit out the frequency difference.


from collections import Counter
from collections import deque
class FreqQ:
    def __init__(self, N):
        self.N = N
        self.q = deque()
        self.counter = Counter()
        self.maxc = ''

    def append(self, c) -> int:
        self.q.append(c)
        self.counter[c] += 1

        if len(self.q) > self.N:
            last = self.q.popleft()
            self.counter[last] -= 1
            if self.counter[last] == 0:
                del self.counter[last]

        max_ = 0
        min_ = len(self.q)
        for v in self.counter.values():
            max_ = max(max_, v)
            min_ = min(min_, v)

        diff = max_ - min_
        return diff

I'm not really sure where I was supposed to go with question 2. Maybe there's a correlation with count unique characters of all substrings or largest rectangle in a histogram.

I'd like to see what you guys can come up with.

EDIT: I'm including what is the current best solution I've found.

I heard the phrase "those who fail to learn from the past are doomed to repeat it." Then I realized what I can do with this problem.

At the start of each range, clear the characters, add the characters for the current window back into the counter (I think this could be improved)
Iterate over window sizes that are big enough to find new maximums
Add only one new character each time
Check the current maximum character count to see if it might be bigger
memoization to remember visited (max, min) character counts. We don't care which characters they are at all. Only the frequencies.

This is over 90% faster than all the other algorithms posted so far. Why is it faster? You avoid visited states and prefer comparisons over subtractions.

def maxFrequencyDiferencePruningMemo(s):
    ans = 0
    memo = set()
    c = defaultdict(int)
    for i in range(len(s)):

        c.clear()
        for v in range(i, min(len(s), i+ans+1)):
            c[s[v]] += 1

        for j in range(i + 1 + ans + 1, len(s)):
            c[s[j]] += 1
            max_ = max(c.values())
            if max_ < ans+1:
                continue
            min_ = min(c.values())
            h = (max_, min_)
            if h in memo:
                continue
            memo.add(h)
            ans = max(ans, max_ - min_)

    return ans

Comments (11)