Monday, November 28, 2016

Aggregation functions in PIG



Example 1:
year price
2015 60
2014 45
2016 34
2014 75
2015 45
2014 41
I would like to get the total price for each year.

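-- The sample above is space-separated, so the delimiter is ' ' (use ',' only if the file really is comma-separated).
grunt> file = load '/tmp/yearprice.txt' using PigStorage(' ') as (year:int, price:float);

-- Drop the header row: "year" cannot be cast to int, so it loads as null and fails the comparison.
grunt> data = filter file by year > 1;

-- "group" is a reserved word in Pig, so the alias is named grp instead.
grunt> grp = group data by year;

grunt> total = foreach grp generate group as year, SUM(data.price);

grunt> dump total;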

(2014,161.0)
(2015,105.0)
(2016,34.0)
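
The same grouping works for Pig's other built-in aggregates. The lines below are a minimal sketch rather than a tested script: they assume the file is space-delimited as in the sample shown, and the alias names (stats, n, min_price, max_price, avg_price) are made up for illustration.

```pig
-- Assumption: /tmp/yearprice.txt is space-delimited, with a header row.
file = load '/tmp/yearprice.txt' using PigStorage(' ') as (year:int, price:float);

-- The header row has a null year, so it does not pass the filter.
data = filter file by year > 1;

grp = group data by year;

-- One output row per year, with several aggregates computed over the same bag.
stats = foreach grp generate
    group as year,
    COUNT(data) as n,
    MIN(data.price) as min_price,
    MAX(data.price) as max_price,
    AVG(data.price) as avg_price;

dump stats;
```

For 2014, for instance, the three rows (45, 75, 41) should give a count of 3, a minimum of 41 and a maximum of 75.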


Example 2:

id1,1,on,400
id1,2,off,100
id2,3,on,200
I would like to get the sum of $3 where $2 is 'on', grouped by ID ($0).

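grunt> file = load '/tmp/file' using PigStorage(',');

-- No schema is declared, so the fields load as bytearray; cast $3 to int so both branches of the conditional have the same type.
grunt> refineData = foreach file generate $0, $1, (($2 == 'on') ? (int)$3 : 0);

grunt> grp = group refineData by $0;

-- refineData.$2 is the conditional amount (the third generated field), not the on/off flag.
grunt> total = foreach grp generate group as id, SUM(refineData.$2);

grunt> dump total;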

(id1,400)
(id2,200)
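
An alternative is to filter out the 'off' rows before grouping instead of zeroing them with a conditional. This is only a sketch under the same assumptions as above (comma-delimited /tmp/file, no declared schema); the aliases onRows and amounts and the field names id and amount are made up for illustration.

```pig
file = load '/tmp/file' using PigStorage(',');

-- Keep only the rows whose third field is 'on'.
onRows = filter file by $2 == 'on';

-- Name the two fields that are still needed and cast the amount to int.
amounts = foreach onRows generate $0 as id, (int)$3 as amount;

grp = group amounts by id;

total = foreach grp generate group as id, SUM(amounts.amount);

dump total;
```

The two approaches differ for an ID that has no 'on' rows at all: the filter version drops that ID from the output, while the conditional version keeps it with a total of 0. In the sample data every ID has at least one 'on' row, so both give the same totals.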
