Hi! I wonder how to optimize the bucket sort when we want to sort an array with numbers T and second array P which include information like interval [a,b] and the probability of that the nubmer from given interval are in the T array.

For exmaple.
T = [6.1, 1.2, 1.5, 3.5, 4.5, 2.5, 3.9, 7.8]
P = [(1, 5, 0.75) , (4, 8, 0.25)]

Result
T = [1.2, 1.5, 2.5, 3.5, 3.9, 4.5, 6.1, 7.8]

THE P ARRAY ARE NOT SORTED

I tried to implement that but when i do something more then just regullary create a buckets I get something much much slower.

The key is to take advantage of the given intervals and probability. On wikipedia I read that we wnat to make bucket based on density.
So here are a formula that I came up with but it doens't work for me.

k = len(P)
    boundries = []

    for el in P:
        # el = (a,b,c)
        a = el[0]
        b = el[1]
        c = el[2]
        boundries.append(a)
        start = a
        for i in range(a,b + 1):
            end = (b - a) / (k * c) + start
            B.append([])
            boundries.append(end)
            start = end

I don't know how to create the buckets and then how to get appropriate index of bucket while sorting the T array

Comments (0)