Aggregation functions in PIG
Example 1: sum of price by year
year | price |
2015 | 60 |
2014 | 45 |
2016 | 34 |
2014 | 75 |
2015 | 45 |
2014 | 41 |
grunt> file = load '/tmp/yearprice.txt' using PigStorage(',') as (year:int,price:float);
grunt> data = filter file by $0 > 1;
grunt> grp = group data by year;
grunt> sum = foreach grp generate group as year, SUM(data.price);
grunt> dump sum;
2014 161
2015 105
2016 34
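SUM is only one of Pig's built-in aggregates; COUNT, AVG, MIN and MAX work on the same grouped bag. A minimal sketch on the same input file, where the alias names prices, byYear and stats and the output field names are made up for illustration:
grunt> -- prices, byYear and stats are illustrative alias names, not from the example above
grunt> prices = load '/tmp/yearprice.txt' using PigStorage(',') as (year:int,price:float);
grunt> byYear = group prices by year;
grunt> -- one output row per year: row count, average, smallest and largest price
grunt> stats = foreach byYear generate group as year, COUNT(prices) as n, AVG(prices.price) as avg_price, MIN(prices.price) as min_price, MAX(prices.price) as max_price;
grunt> dump stats;
For the sample rows, 2014 should come out with a count of 3, a minimum of 41 and a maximum of 75.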
Example 2:
id1, 1,on,400
id1, 2,off,100
id2, 3,on,200
I would like to get the result as "sum of $3 if $2 is 'on', by ID $0".
grunt> file = load '/tmp/file' using PigStorage(',');
grunt> refineData = foreach file generate $0, $1, (($2 == 'on') ? $3 : 0);
grunt> grp = group refineData by $0;
grunt> sum = foreach grp generate group as id, SUM(refineData.$2);
grunt> dump sum;
id1, 400
id2, 200
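If ids that have no 'on' rows do not need to appear in the result, filtering before grouping is a possible alternative to the conditional expression. This is only a sketch: the field names id, seq, status and amount and the aliases records, onRows, byId and onSum are assumptions, since the original load declares no schema.
grunt> -- field names below are assumed for illustration; seq is kept as chararray because of the leading space in the sample data
grunt> records = load '/tmp/file' using PigStorage(',') as (id:chararray, seq:chararray, status:chararray, amount:int);
grunt> -- keep only the 'on' rows, then sum amount per id
grunt> onRows = filter records by status == 'on';
grunt> byId = group onRows by id;
grunt> onSum = foreach byId generate group as id, SUM(onRows.amount);
grunt> dump onSum;
On the three sample rows this gives 400 for id1 and 200 for id2. The difference from the conditional version is that an id whose rows are all 'off' is dropped here, while the conditional version keeps it with a sum of 0.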