Skip to content

Instantly share code, notes, and snippets.

@statgeek
Last active December 12, 2018 20:56
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save statgeek/14e3aa2a9f718f551cd98134e9ceed30 to your computer and use it in GitHub Desktop.
Save statgeek/14e3aa2a9f718f551cd98134e9ceed30 to your computer and use it in GitHub Desktop.
SubGraph Macro
/*
The SubGraphs macro
Author: PGSTATS
This SAS macro finds the disjoint subgraphs in a graph described by a set of arcs.
The input dataset requires only two variables of the same length and type, character
or numeric. Each observation describes an arc in the graph. The output dataset will
contain two variables: Node will be similar in size and type to the input variables
and Clust will be a number identifying the clusters.
Macro arguments
arcs : input dataset name
from : first arc variable (optional, default=from)
to : second arc variable (optional, default=to)
out : output dataset name (optional, default=Clusters)
exp : power of two exponent of internal hash size (optional, default=8)
Tested with SAS version 9.3
PGStats
EDIT: A newer version is available here:
https://communities.sas.com/t5/SAS-Communities-Library/SAS-macro-for-finding-all-paths-in-a-directed-graph-network/ta-p/221568
*/
/* Example : From a list of American city pairs separated by less than 800 miles,
we identify two regions where one could get from one city to the next without
ever travelling more than 800 miles. */
/*Example data view: convert city distance matrix to city pairs list.
Keep nearby cities (<800 miles apart) only. */
/*
data Nearby(keep=from to Miles) / view=Nearby;
set Sashelp.Mileages;
length from to $15;
array c{*} _NUMERIC_;
from = upcase(compress(City,". -"));
do i = 1 to dim(c);
if c{i} > 0 and c{i} < 800 then do;
to = upcase(vname(c{i}));
Miles = c{i};
output;
end;
end;
run;
*/
/* Macro definition */
%macro SubGraphs(arcs,from=from,to=to,out=Clusters,exp=8);
data _null_;
if 0 then set &arcs(keep=&from rename=(&from=node)); /* get node data type */
length clust 8;
declare hash nodes(hashexp:&exp);
nodes.defineKey('node');
nodes.defineData('node', 'clust');
nodes.defineDone();
declare hiter nodeList('nodes');
do newClust = 1 by 1 while(not endLoop);
set &arcs end=endLoop;
call missing(clust); node = &from;
if 0^=nodes.find() then nodes.add();
fromClust = clust;
call missing(clust); node = &to;
if 0^=nodes.find() then nodes.add();
toClust = clust;
if n(fromClust, toClust) = 0 then do;
nodes.replace(key:&from, data:&from, data:newClust);
nodes.replace(key:&to, data:&to, data:newClust);
end;
else if missing(toClust) then
nodes.replace(key:&to, data:&to, data:fromClust);
else if missing(fromClust) then
nodes.replace(key:&from, data:&from, data:toClust);
else if fromClust ne toClust then do;
rc = nodeList.first();
do while (rc = 0);
if clust = fromClust then
nodes.replace(key:node, data:node, data:toClust);
rc = nodeList.next();
end;
end;
end;
nodes.output(dataset:"&out");
stop;
run;
%mend SubGraphs;
/* Example cont'd. Call the macro with the city pairs list dataview
and print the results. */
/*
options mprint symbolgen;
%SubGraphs(Nearby,out=Regions,exp=4);
proc sql;
select clust as Region, node as City
from Regions order by clust, node;
quit;
*/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment