Using Macros to run repetitive analyses or processes

How many times have you run an analysis where you need to go back and rerun it with a new variable?  Or do you have an Excel spreadsheet with several worksheets that you need to import? I like to write quick little macros that help me get around editing my program over and over OR copying and pasting a lot of code.

Let’s start with an Excel spreadsheet with temperature data for the months of January through to August, with each month on a separate worksheet.

excel_data

My first step would be to import the January worksheet.  By using the File -> Import option, I’m going to save the code that SAS creates.  It will look something like this:

PROC IMPORT OUT= WORK.January
DATAFILE= “C:\Guelph_weather_201301_08.xlsx”
DBMS=EXCEL REPLACE;
RANGE=”January$”;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

Things to take note of:

  1. The name of the worksheet that you are importing – it is in the RANGE= statement.
  2. Also note when I imported the worksheet I called it January – WORK.January

To keep things more organized, at least for me, I called each worksheet in my Excel file by the name of the month and when I import the sheet I call it by the name of the month as well.

Quickly review your datafile

Proc print data=January;
Run;

Let’s just run a few basic analyses on the contents of this datafile:

/* Means of the maximum and mean temperatures that were observed during the month of January  */

Proc means data=January;
var mean_temp max_temp;
Run;

/*  Correlations between mean and maximum temperatures for the month of January */

Proc corr data=January;
var mean_temp max_temp;
Run;

/* Regression analysis to predict the mean temperature from maximum temperature for January  */

Proc reg data=January;
model mean_temp = max_temp;
Run;
Quit;

All very straight forward analyses.  Now remember that our Excel datafile contains data for January, February, March, April, May, June, July, and August.  I can very easily go back and edit the above coding and change January to February and if I’m interested in predicting the mean temperature in February with the minimum temperature – that’s another easy edit.  Or if I want to capture every step of my analyses, I may copy and paste the above code and edit for the appropriate month and variables.

Very easy but it can get messy – so let’s try writing a macro around this code.

First step is to create a macro and give it a name – remember SAS naming conventions will come to play here.  Cannot start with a number or funny character and it cannot be a SAS command.  For our example let’s call it weather.

%macro weather

Now we need to think about what we need to change in our coding, the name of the month is one item – so we need to add this to our macro statement.

%macro weather(cal_month

Remember we cannot use the word “month” because it is used by SAS, so I’ve added the “cal_” in front of it – you add whatever makes sense to you.

Is there anything else we need to change coding for?  Yes, the names of our variables – remember we may want to change which variables we are studying.  We list these, close our bracket and remember that semicolon.

%macro weather(cal_month, var1, var2);

Now we add the coding we want to use repetitively:

%macro weather(cal_month, var1, var2);

PROC IMPORT OUT= WORK.January
DATAFILE= “C:\Users\edwardsm\Documents\Michelle_Docs\Workshops\SAS_Noon_Hour\Macros
\Guelph_weather_201301_08.xlsx”
DBMS=EXCEL REPLACE;
RANGE=”January$”;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

Proc means data=January;
var mean_temp max_temp;
Run;

Proc corr data=January;
var mean_temp max_temp;
Run;

Proc reg data=January;
model mean_temp = max_temp;
Run;
Quit;

%mend;

To finish the macro coding you need to an %mend at the end.

Ah but we’re not done yet!!!

Now we need to go back into the coding and replace the word January with a macro parameter – in other words, remember how we used cal_month after our macro name?  Well now you need to replace every occurrence of “January” with “&cal_month” .  For example:

PROC IMPORT OUT= WORK.January
DATAFILE= “C:\Users\edwardsm\Documents\Michelle_Docs\Workshops\SAS_Noon_Hour\Macros
\Guelph_weather_201301_08.xlsx”
DBMS=EXCEL REPLACE;
RANGE=”January$”;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

should now look like:

PROC IMPORT OUT= WORK.&cal_month
DATAFILE= “C:\Users\edwardsm\Documents\Michelle_Docs\Workshops\SAS_Noon_Hour\Macros
\Guelph_weather_201301_08.xlsx”
DBMS=EXCEL REPLACE;
RANGE=”&cal_month$”;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

You’ll need to do the same thing for mean_temp and max_temp.  Make sure that you use the same &var1 for mean_temp and &var2 for max_temp.

Your final coding should look like:

%macro weather(cal_month, var1, var2);
PROC IMPORT OUT= WORK.&cal_month
DATAFILE= “C:\Users\edwardsm\Documents\Michelle_Docs\Workshops\SAS_Noon_Hour\Macros\Guelph_weather_201301_08.xlsx”
DBMS=EXCEL REPLACE;
RANGE=”&cal_month$”;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

title “Results for the month of &cal_month 2013″;

Proc print data=&cal_month;
Run;

Proc means data=&cal_month;
var &var1 &var2;
Run;

Proc corr data=&cal_month;
var &var1 &var2;
Run;

Proc reg data=&cal_month;
model &var1 = &var2;
Run;
Quit;
%mend;

Now that’s the macro coding – but we haven’t run it yet.  That’s our next step.

Now let’s say we have already this analysis for the month of January and now we want to run it for February.  All we need to do is to invoke the macro by calling it up and specifying cal_month, var1 and var2.

%weather(February, mean_temp, max_temp);

SAS will replace every occurrence of &cal_month with February – that means it will import the worksheet labelled February and create a SAS dataset called February.

SAS will replace every occurrence of &var1 with mean_temp and every occurrence of &var2 with max_temp.

To run all the months – you would simply use:

%weather(January, mean_temp, max_temp);
%weather(February, mean_temp, max_temp);
%weather(March, mean_temp, max_temp);
%weather(April, mean_temp, max_temp);
%weather(May, mean_temp, min_temp);
%weather(June, mean_temp, max_temp);
%weather(July, mean_temp, max_temp);
%weather(August, mean_temp, min_temp);

Please note that in the final SAS coding I added a title statement with the &cal_month to help me when I review all the output when I’m done for the day.

Screen Shot 2013-11-18 at 7.33.07 PM