hadoop - Pig: SUM a column until it reaches a certain value and return the rows


Can someone tell me how to calculate the sum of a column until it reaches a certain value? Use case: find the top products that produced 50% of revenue.

Is there a library in Piggybank that has done this? I couldn't find one in Piggybank.

I am trying to implement a UDF, but I am worried about going that way :(.

Here is what the data structure looks like:

productid, totalprofitbyproduct, totalprofitbycompany, totalrevenueofcompany.

The data is in descending order of totalprofitbyproduct. totalprofitbycompany and totalrevenueofcompany remain the same in every row.

Now I want to apply a running sum on totalprofitbyproduct for each product, starting from the top, and return the top products that together generated more than 50% of totalprofitbycompany or totalrevenueofcompany.
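To make the requirement concrete, here is a minimal sketch of the logic in plain Python rather than Pig. The function name `top_rows_until_share`, the tuple layout, and the sample values are all hypothetical, and the sketch assumes the rows are already sorted by totalprofitbyproduct in descending order and that "greater than 50%" means a strict comparison.

```python
def top_rows_until_share(rows, share=0.5):
    """Return the leading rows whose running profit first exceeds
    `share` of the company total.

    rows: list of (productid, totalprofitbyproduct, totalprofitbycompany)
          tuples, assumed to be sorted by totalprofitbyproduct descending.
    """
    selected = []
    running = 0.0
    for productid, profit, company_total in rows:
        # Keep the row, then check whether the running sum crossed the cutoff.
        selected.append((productid, profit, company_total))
        running += profit
        if running > share * company_total:
            break
    return selected


# Hypothetical sample data: four products, company total of 100.
sample = [
    ("p1", 40.0, 100.0),
    ("p2", 30.0, 100.0),
    ("p3", 20.0, 100.0),
    ("p4", 10.0, 100.0),
]

result = top_rows_until_share(sample)
print([row[0] for row in result])
```

If the threshold is never exceeded, the sketch returns every row. In Pig, something along these lines could probably be wrapped as a Jython UDF and applied to an ordered bag, though that part is not shown here and collecting all rows into a single bag may not scale to very large inputs.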

Piggybank has a percentile UDF; can it be used for this requirement?

A Pig script along with a UDF can achieve this.