Hi! I wonder how to optimize the bucket sort when we want to sort an array with numbers T and second array P which include information like interval [a,b] and the probability of that the nubmer from given interval are in the T array.
For exmaple.
T = [6.1, 1.2, 1.5, 3.5, 4.5, 2.5, 3.9, 7.8]
P = [(1, 5, 0.75) , (4, 8, 0.25)]
Result
T = [1.2, 1.5, 2.5, 3.5, 3.9, 4.5, 6.1, 7.8]
THE P ARRAY ARE NOT SORTED
I tried to implement that but when i do something more then just regullary create a buckets I get something much much slower.
The key is to take advantage of the given intervals and probability. On wikipedia I read that we wnat to make bucket based on density.
So here are a formula that I came up with but it doens't work for me.
k = len(P)
boundries = []
for el in P:
# el = (a,b,c)
a = el[0]
b = el[1]
c = el[2]
boundries.append(a)
start = a
for i in range(a,b + 1):
end = (b - a) / (k * c) + start
B.append([])
boundries.append(end)
start = endI don't know how to create the buckets and then how to get appropriate index of bucket while sorting the T array