Created
March 7, 2017 03:57
Split a dataset into smaller tables.
/*-----------------------------------------------------------------------------|
| Description : Macro that splits a dataset into smaller datasets.             |
| Assumptions : This macro is based on the paper "Splitting a Large SAS Data   |
|               Set" by John R. Gerlach and Simant Misra.                      |
| Parameters  : InputDataset     - Table to be split. It can include the       |
|                                  library.                                    |
|               NumberOfDatasets - Number of split tables to create.           |
|               OutputDatasets   - Prefix of the split tables. It can include  |
|                                  the library.                                |
| Output      : A set of split tables created from the input table.            |
|-----------------------------------------------------------------------------*/
/* Example:
* Creates the table needed to run the example. ;
DATA DUMMY_TABLE;
  DO I = 1 TO 48;
    OUTPUT;
  END;
RUN;
* Ex_01;
%MSplitDataset(InputDataset = DUMMY_TABLE);
* Output: Splits DUMMY_TABLE into two tables (WORK._DS_001 and WORK._DS_002),
          each with 24 records. ;
*/
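/* Ex_02 (illustrative sketch, not one of the original examples; it assumes the
   DUMMY_TABLE created above and a hypothetical output prefix WORK.PART_):
%MSplitDataset(InputDataset     = DUMMY_TABLE,
               NumberOfDatasets = 5,
               OutputDatasets   = WORK.PART_);
   Expected result, worked from the split-point formula used by the macro:
     _SPLITPOINT = INT(48 / 5) + (MOD(48, 5) ~= 0) = 9 + 1 = 10
   so WORK.PART_001 to WORK.PART_004 should receive 10 records each and
   WORK.PART_005 the remaining 8.
*/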
/*-----------------------------------------------------------------------------|
| Date            Author                        Description                    |
|------------------------------------------------------------------------------|
| March 06, 2016  Cesar R. Urteaga-Reyesvera    Creation.                      |
|-----------------------------------------------------------------------------*/
%MACRO MSplitDataset(InputDataset     =           /* Input table. It can include
                                                     the library. */,
                     NumberOfDatasets = 2         /* Number of tables into which
                                                     the input table will be
                                                     split. */,
                     OutputDatasets   = WORK._DS_ /* Prefix of the split tables.
                                                     It can include the library.
                                                  */
                    );
  %LOCAL _I; /* Keep the loop counter local to this macro. */
  DATA %DO _I = 1 %TO &NumberOfDatasets.;
         &OutputDatasets.%SYSFUNC(PUTN(&_I., Z3.))(DROP = _SPLITPOINT)
       %END;
       ;
    SET &InputDataset. NOBS = _NOBS;
    RETAIN _SPLITPOINT;
    * Compute the split point: the ceiling of _NOBS / NumberOfDatasets. ;
    IF _N_ = 1 THEN
      _SPLITPOINT = INT(_NOBS / &NumberOfDatasets.) +
                    (MOD(_NOBS, &NumberOfDatasets.) ~= 0);
    * Send each record to its designated dataset. ;
    IF _N_ <= _SPLITPOINT THEN
      OUTPUT &OutputDatasets.001;
    %DO _I = 2 %TO &NumberOfDatasets.;
      ELSE IF _N_ <= (&_I. * _SPLITPOINT) THEN
        OUTPUT &OutputDatasets.%SYSFUNC(PUTN(&_I., Z3.));
    %END;
  RUN;
%MEND MSplitDataset;
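/* A possible check (a sketch, not part of the original macro): after running
   Ex_01 with the default prefix WORK._DS_, the row count of each split table
   can be listed from DICTIONARY.TABLES. The filter below assumes that no other
   WORK table has a name starting with _DS_.
PROC SQL;
  SELECT MEMNAME, NOBS
    FROM DICTIONARY.TABLES
   WHERE LIBNAME = 'WORK' AND SUBSTR(MEMNAME, 1, 4) = '_DS_';
QUIT;
*/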