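Can anyone tell me how to calculate the sum of a column until it reaches a certain value? Use case: find the top products that produced 50% of revenue.
Is there a library such as Piggybank where this is already done? I couldn't find it in Piggybank.
I am trying to implement a UDF, but I am worried that is the wrong way to go :(.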
Here is what the data structure looks like:
productid, totalprofitbyproduct, totalprofitbycompany, totalrevenueofcompany.
The data is in descending order on totalprofitbyproduct. totalprofitbycompany and totalrevenueofcompany remain the same in every row.
Now I want to apply a running sum on totalprofitbyproduct, product by product from the top, to find the top products that together generated more than 50% of totalprofitbycompany or totalrevenueofcompany.
Piggybank has a percentile UDF, which can be used for this requirement.
A Pig script along with a UDF can achieve this.
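As a rough illustration of the running-sum logic such a UDF might contain, here is a minimal Python sketch. It is not taken from Piggybank: the function name `top_products_until_share`, the `share` parameter, and the sample figures are all hypothetical. It assumes the rows arrive already sorted in descending order on totalprofitbyproduct, as described in the question. Pig can register Python UDFs through Jython, so something along these lines could presumably be adapted to run over an ordered bag, but that wiring is left out here.

```python
def top_products_until_share(rows, share=0.5):
    """Return (productid, running_total) for the top products whose
    cumulative profit first reaches `share` of the company total.

    rows: iterable of (productid, totalprofitbyproduct, totalprofitbycompany),
          assumed to be sorted descending on totalprofitbyproduct.
    """
    selected = []
    running = 0.0
    for productid, profit, company_total in rows:
        # Stop once the products already selected cover the target share.
        if running >= share * company_total:
            break
        running += profit
        selected.append((productid, running))
    return selected


# Hypothetical sample data: company total is 100 in every row.
sample = [
    ("A", 40.0, 100.0),
    ("B", 30.0, 100.0),
    ("C", 20.0, 100.0),
    ("D", 10.0, 100.0),
]

result = top_products_until_share(sample)
print(result)  # [('A', 40.0), ('B', 70.0)]
```

The loop includes the product that pushes the running total past the threshold, so the returned set is the smallest prefix that covers at least 50%. To measure against totalrevenueofcompany instead, pass that column as the third field of each row.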