统计211

标题: SAS中分类汇总的操作 [打印本页]

作者: 275932488    时间: 2011-7-1 13:49
标题: SAS中分类汇总的操作
       在处理数据时,经常遇到的情况是需要将数据按照某一个或者某一些变量进行分类汇总,一般情况下,针对人/企业和时间的居多。这时我们就要用到SAS中的first和last关键字。习惯上,我们称这种分类汇总为Roll Up。举个简单例子,假设我们有一个数据集,这个数据集是trip level的,描述了读者在某个书吧的消费信息,它包含reader_id,year,month,day, time,music_fee,book_fee,… …等字段(music_fee为点歌花费,book_fee为阅读花费)。现在我们需要对这个trip level的数据集进行按照reader_id的Roll Up,输出的结果集将是reader level的,并且针对reader我们要统计其每个月的各种消费。
  options errors=4 compress=yes;
  libname src = "";
  libname tar = "";
  %let var_num = 2;
  %let start_y = mdy();
  %let end_y = &sysdate;
  %let start_m = mdy();
  %let end_m = &sysdate;
  %let year_num = &end_y - &start_y;
  %let month_num = &year_num*12 - &start_m - (12 - &end_m - 1);
  %let var_1 = music_fee;
  %let var_2 = book_fee;
  %macro array_create;
  %do i=1 %to &var_num;
  array a&&var_&i (*) a&&var_&i.._1 - a&&var_&i.._&var_num;
  retain a&&var_&i.._1 - a&&var_&i.._&var_num;
  %end;
  %mend;
  %macro array_init;
  %do i=1 %to &var_num;
  %do j=1 %to &month_num;
  a&&var_&i(&j) = 0;
  %end;
  %end;
  %mend;
  %macro array_set_value;
  %do i=0 %to &year_num - 1;
  = &start_y+&i and Year < &end_y %then;
  %do j=1 %to &var_num;
  /* sum function can ignore missing value, if use a=a+b and b is missing, a will be missing. So sum function is a good choice.*/
  a&&var_&j((Year-&start_y)*12+Month) =
  sum(a&&var_&j((Year-&start_y)*12+Month), &&var_&j);
  %end;
  %end;
  %end;
  %if Year = &end_y and Month < &end_m %then;
  %do j=1 %to &var_num;
  a&&var_&j((Year-&start_y)*12+Month) =
  sum(a&&var_&j((Year-&start_y)*12+Month), &&var_&j);
  %end;
  %end;
  %mend;
  %macro drop_var;
  %do i=1 %to &var_num;
  drop &&var_&i;
  %end;
  %mend
  proc sort data=src.trip_lvl out=tar.trip_lvl_sort;
  by reader_id;
  run;
  /* Roll Up */
  data tar.reader_lvl (drop = Year Month);
  set tar.trip_lvl_sort;
  by reader_id;
  %array_create;
  if first.reader_id then do;
  %array_init;
  end;
  %array_set_value;
  if last.reader_id then do;
  %drop_var;
  end;
  /*Only keep the last observation for each reader because it has completed statistical information*/
  if last.reader_id;
  run;





欢迎光临 统计211 (http://www.tj211.com/) Powered by Discuz! X3.2