Processing RDS data
ID netsize mycoupon coupon1 coupon2 coupon3 coupon4 coupon5 coupon6 coupon7 coupon8 gender.mf age airplay.yn
1 350 0 14250004 14250005 14250006 14256002 0 0 0 901 1 40 1
2 0 0 14250007 14250008 14250009 14256003 0 0 912 902 1 64 1
3 585 0 14250010 14250011 14250012 14256004 0 0 0 903 2 41 1
4 400 0 14250025 14250026 14250027 14256009 0 0 0 904 2 77 0
5 150 0 14250022 14250023 14250023 14256008 0 0 0 905 1 33 1
6 100 0 14250028 14250029 14250030 14256010 0 0 916 906 1 31 2
7 300 0 14250016 14250017 14250018 14256006 0 0 0 907 1 70 1
8 700 0 14250040 14250041 14250042 14256014 0 0 0 908 1 49 1
9 300 14256002 14250013 14250014 14250015 14256005 0 0 0 909 2 38 1
10 200 14250013 14250019 14250020 14250021 14256007 0 0 0 9010 1 37 1
11 200 14250005 14250031 14250032 14250033 14256011 0 0 0 9011 1 50 1
12 300 14250004 14250034 14250035 14250036 14256012 0 0 0 9012 1 41 1
13 100 14250012 14250103 14250102 14250101 14256101 0 0 0 9013 2 51 1
14 383 14250026 14250037 14250038 14250039 14256013 0 0 0 9014 2 46 1
15 700 14256007 14250043 14250044 14250045 14256015 0 0 0 9015 2 101 2
16 80 14250043 14250046 14250047 14250048 14256017 0 0 0 9016 2 50 1
17 300 14250033 14250104 14250105 14250106 14256016 0 0 0 9017 1 41 1
18 200 0 14250049 14250050 14250051 14256018 0 0 9118 9018 1 35 1
19 100 14250010 14250116 14250117 14250118 14256020 0 0 0 9019 1 34 1
20 400 14250029 14250107 14250108 14250109 14256102 0 0 0 9020 1 42 1
21 200 14250105 14250111 14250112 14250110 14256103 0 0 0 9021 1 33 2
22 150 14250037 14250113 14250114 14250115 14256019 0 0 0 9022 1 50 1
23 100 14250030 14250052 14250053 14250054 14256021 0 0 0 9023 1 34 1
24 150 14256004 14250057 14250056 14250055 14256022 0 0 0 9024 2 60 2
25 200 14250054 14250058 14250059 14250060 14256023 0 0 0 9025 1 34 1
26 50 14250053 14250061 14250062 14250063 14256024 0 0 0 9026 1 45 2
27 300 14250025 14250064 14250065 14250066 14256025 0 0 0 9027 2 40 1
28 850 14256009 14250122 14250123 14250124 14256027 0 0 0 9028 2 59 1
29 0 14256003 0 0 0 0 0 0 0 9029 2 0 1
30 500 14250066 14250067 14250068 14250069 900 0 0 0 9030 1 54 1
31 30 14250113 14250070 14250071 14250072 14256028 0 0 0 9031 1 48 1
32 200 14256025 14250125 14250126 14250127 14256034 0 0 0 9032 2 41 2
33 300 14256018 14250089 14250090 14250091 14256035 0 0 0 9033 2 46 1
34 500 14250059 14250098 14250099 14250100 14256038 0 0 0 9034 1 31 1
35 500 14250098 14250095 14250096 14250097 14256037 0 0 0 9035 1 52 1
36 25 14250049 14250133 14250134 14250135 14250136 14256040 0 0 9036 1 29 1
37 150 14250051 14250137 14250138 14250139 14250140 14256041 0 9137 9037 1 34 1
38 50 14250050 14250166 14250167 14250168 14250169 14256047 0 0 9038 1 42 1
39 500 14250089 14250162 14250163 14250164 14250165 14256046 0 0 9039 1 43 1
40 150 14256011 14250141 14250142 14250143 14250144 14256042 0 0 9040 2 31 1
41 150 14250007 14250145 14250146 14250147 14250148 14256043 0 0 9041 1 58 1
42 300 14250028 14250154 14250155 14250156 14250157 14256045 0 0 9042 1 38 1
43 300 14256043 14250150 14250151 14250152 14250153 14256044 0 0 9043 2 61 1
44 300 14256022 14250170 14250171 14250172 14250173 14256048 0 0 9044 2 48 1
45 0 902 14250174 14250175 14250176 14250177 14256049 0 0 9045 1 80 1
46 200 14250039 14250203 14250204 14250205 14250206 14256104 0 0 9046 2 53 1
47 250 14256010 14250201 14250202 14250207 14250208 14250209 0 0 9047 1 36 1
48 400 14250162 14250210 14250211 14250212 14250213 14256105 0 0 9048 1 59 1
49 150 14256013 14250178 14250179 14250180 14250181 14256106 0 0 9049 2 44 1
50 100 14250038 14250182 14250183 14250184 14250185 14256107 14250401 9150 9050 2 54 1
51 0 14250206 14250186 14250187 14250188 14250189 14256050 0 0 9051 1 59 1
52 250 14250211 14250190 14250191 14250192 14250193 14256051 0 0 9052 1 64 1
53 100 0 14250194 14250195 14250196 14250197 14256052 0 0 9053 1 49 1
54 200 14250210 14250214 14250215 14250216 14250217 14256053 0 0 9054 1 50 1
55 0 14250194 14250218 14250219 14250220 14250221 14256055 0 0 9055 1 52 1
56 100 14250214 14250222 14250223 14250224 14250225 14256056 0 0 9056 1 41 1
57 150 14250183 14250226 14250227 14250228 14250229 14256057 0 0 9057 1 70 1
58 150 14250182 14250230 14250231 14250232 14250233 14256058 0 0 9058 1 29 1
59 150 14256107 14250234 14250235 14250236 14250237 14256059 0 0 9059 2 40 1
60 600 9050 14250242 14250243 14250244 14250245 14256061 0 0 9060 2 48 2
61 60 14250185 14250246 14250247 14250248 14256062 0 0 0 9061 1 49 1
62 300 14250184 14250250 14250251 14250252 14250253 14256063 0 0 9062 2 46 1
63 50 14256052 14250254 14250255 14250256 14250257 14256064 0 0 9063 2 56 1
64 500 14250401 14250258 14250259 14250260 14250261 14256065 0 0 9064 1 53 2
65 300 14256065 14250262 14250263 14250264 14250265 14256066 0 0 9065 2 39 2
66 100 14250205 14250266 14250267 14250268 14250269 14256067 0 0 9066 2 55 1
67 700 14250065 14250270 14250271 14250272 14250273 14256068 0 0 9067 2 61 1
68 100 14250138 14250274 14250275 14250276 14250277 14256069 0 0 9068 1 35 1
69 300 9037 14250278 14250279 14250280 14250281 14256070 0 0 9069 2 27 1
70 200 14250243 14250282 14250283 14250284 14250285 14256071 0 0 9070 1 49 1
71 200 9061 14250290 14250291 14250292 14256073 0 0 0 9071 1 55 1
72 100 14250253 14250294 14250295 14250296 14250297 14256074 0 0 9072 2 63 1
73 150 9071 14250298 14250299 14250300 14250301 1425075 0 0 9073 1 58 1
74 20 14250301 14250302 14250303 14250304 14250305 14256076 0 0 9074 1 47 1
75 0 14250297 14250306 14250307 14250308 14250309 14256077 0 0 9075 1 54 1
76 200 14250305 14250310 14250311 14250312 14250313 14256078 0 0 9076 1 30 1
77 500 14250309 14250314 14250315 14250316 14250317 14256079 0 0 9077 1 54 1
78 100 14250310 14250318 14250319 14250320 14250321 14256080 0 0 9078 1 59 1
79 35 14256104 14250322 14250324 14250325 14256081 0 0 0 9079 2 32 2
80 700 14250197 14250326 14250327 14250328 14250329 14256082 0 0 9080 1 64 1
81 70 14256056 14250286 14250287 14250288 14250289 14256072 0 9181 9081 2 43 1
82 300 14250107 14250330 14250331 14250332 14250333 14256083 0 0 9082 1 46 1
83 200 14250317 14250334 14250335 14250336 14250337 14256084 0 0 9083 1 60 1
84 500 14250258 14250338 14250339 14250340 14250341 14256085 0 0 9084 1 44 1
85 125 14256064 14250342 14250343 14250345 14250346 142560347 0 0 9085 1 57 1
86 500 14250304 14250355 14250356 14250367 14250378 14256088 0 0 9086 1 61 2
87 500 14250234 14250347 14250348 14250349 14250350 14256086 0 0 9087 2 32 1
88 250 9137 14250359 14250360 14250361 14250362 14256089 0 0 9088 1 30 1
89 100 14250064 14250351 14250352 14250353 14250354 14256087 0 0 9089 1 47 2
90 200 14250282 14250375 14250376 14250377 14250378 14256093 0 0 9090 1 42 1
91 75 14250262 14250379 14250380 14250381 14250382 14256094 0 0 9091 0 44 1
92 30 9070 14250383 14250384 14250385 14250386 14256095 0 0 9092 1 30 1
93 0 906 14250387 14250388 14250389 14250390 14256096 0 0 9093 1 73 1
94 150 14250264 14250391 14250392 14250393 14250394 14256097 0 0 9094 1 32 1
95 150 14250209 14250395 14250396 14250397 14250398 14256098 0 0 9095 2 31 1
96 200 14250259 14250404 14250405 14250406 14250407 14256099 0 0 9096 2 30 1
97 400 9018 14250412 14250413 14250414 14250415 14256109 0 0 9097 1 39 1
98 0 9150 14250416 14250417 14250418 14250419 14256110 0 0 9098 1 59 1
99 0 14250267 14250420 14250421 14250422 14250423 14256111 0 9199 9099 2 76 2
100 100 912 14250424 14250425 14250426 14250427 14256112 0 0 90100 1 65 1
101 0 14250422 14250428 14250429 14250430 14250431 14256113 0 0 90101 1 59 1
102 100 14250318 14250437 14250436 14250438 14250349 14256115 0 0 90102 1 45 1
103 100 14250436 14250432 14250433 14250434 14250435 14256114 0 0 90103 1 55 1
104 200 14250256 14250440 14250441 14250442 14250443 14256116 0 0 90104 1 45 1
105 20 14250428 14250501 14250502 14250503 14250504 14256301 0 0 90105 2 47 2
106 100 14250166 14250505 14250506 14250507 14250508 14250509 0 0 90106 1 40 1
107 250 14250261 14250511 14250512 14250513 14250514 14256201 0 0 90107 1 43 1
108 250 14250423 14250444 14250445 14250446 14250447 14256117 0 0 90108 2 63 1
109 50 14250388 14250448 14250449 14250450 14250451 14256118 0 0 90109 1 53 1
110 60 14250387 14250452 14250453 14250454 14250455 14256119 0 0 90110 1 32 2
111 100 916 14250456 14250457 14250458 14250459 14256120 0 0 90111 2 25 2
112 500 14256112 14250511 900 14250513 900 14256201 0 0 90112 2 49 1
113 100 14250390 14250469 14250470 14250471 14250472 14256123 0 0 90113 1 25 2
114 150 14250432 14250539 14250540 14250541 14250542 14256208 0 0 90114 1 54 1
115 500 9081 14250535 14250536 14250537 14250538 14256207 0 0 90115 1 52 1
116 200 14250389 14250477 14250478 14250479 14250480 14256124 0 0 90116 1 25 1
117 200 90112 14250550 14250551 14250552 14250553 14256126 0 0 90117 1 54 2
118 100 9038 14250554 14250555 14250556 14250557 14256127 0 0 90118 1 50 0
119 20 9099 14250481 14250482 14250483 14250484 14256128 0 0 90119 2 51 2
120 150 14250278 14250558 14250559 14250560 14250561 14256129 0 0 90120 2 55 1
121 100 14250505 14250562 14250563 14250564 14250565 14256130 0 0 90121 1 30 2
122 90 14250181 14250566 14250567 14250568 14250569 14256131 0 0 90122 2 55 2
123 0 14250168 14250570 14250571 14250572 14250573 14256132 0 0 90123 1 50 1
124 100 9118 14250519 14250520 14250521 14250522 14256203 0 0 90124 1 33 1
125 150 14250108 14250574 14250575 14250576 14250577 14256134 0 0 90125 1 23 2
126 200 14250444 14250578 14250579 14250580 14250581 14256135 0 0 90126 2 58 1
127 0 14250294 14250489 14250490 14250491 14250492 14256209 0 0 90127 1 73 1
128 20 14250056 14250497 14250498 14250499 14250500 14256211 0 0 90128 2 29 2
129 70 14250157 14250461 14250462 14250463 143000000 14256121 0 0 90129 1 48 1
130 400 9055 14250586 14250587 14250588 14250589 0 0 0 90130 1 31 2
131 25 14250420 14250582 14250583 14250584 14250585 14256136 0 0 90131 1 54 2
132 50 14250322 14250531 14250532 14250533 14250534 14256206 0 0 90132 1 45 1
133 200 14250580 14250546 14250547 14250548 14250549 14256125 0 0 90133 1 55 1
134 300 9034 14250590 14250591 14250592 14250593 14256138 0 0 90134 1 38 1
135 250 14250594 14250598 14250599 14250600 14250601 14256140 0 0 90135 1 24 1
136 50 14250595 14250602 14250603 14250604 14250605 14256141 0 0 90136 1 21 2
137 200 14250562 14250594 14250595 14250596 14250597 14250139 0 0 90137 1 25 1
138 100 14250550 14250606 14250607 14250608 14250609 14256212 0 0 90138 1 66 1
139 500 14250587 14250363 14250364 14250365 14250366 14256090 0 0 90139 1 42 1
140 150 14250338 14250610 14250611 14250612 14250613 14256213 0 0 90140 2 65 2
141 150 14250109 14250614 14250615 14250616 14250617 14256214 0 0 90141 1 44 1
142 600 14250551 14250622 14250623 14250624 14250625 14256216 0 0 90142 1 47 1
143 0 14250339 14250626 14250627 14250628 14250629 14256217 0 0 90143 2 29 1
144 100 14250506 14250618 14250619 14250620 14250621 14256215 0 0 90144 1 31 1
145 30 14250563 14250634 14250635 14250636 14250367 14256219 0 0 90145 2 26 2
146 500 9181 14250523 14250524 14250525 14250526 14256204 0 0 90146 1 39 1
147 150 14250564 14250630 14250631 14250632 14250633 14256218 0 0 90147 1 22 1
148 200 14250308 14250638 14250639 14250640 14250641 14256302 0 0 90148 1 47 1
149 300 14256094 14250642 14250643 14250644 14250645 14256303 0 0 90149 1 38 2
150 200 14250445 14250646 14250647 14250648 14250649 14256304 0 0 90150 1 76 1
151 150 14250307 14250650 14250651 14250652 14250653 14256305 0 0 90151 1 47 2
152 300 14250446 14250654 14250655 14250656 14250657 14256306 0 0 90152 1 49 1
153 100 90141 14250658 14250659 14250660 14250661 14256307 0 0 90153 1 41 1
154 500 14250195 14250663 14250664 14250665 14250666 14256308 0 0 90154 1 59 1
155 200 14250581 14250667 14250668 14250669 14250670 14256309 0 0 90155 1 64 1
156 200 14250507 14250371 14250672 14250673 14250674 14256310 0 0 90156 1 48 1
157 100 14256110 14250675 14250676 14250677 14250678 14256311 0 0 90157 2 66 1
158 100 14250490 14250679 14250680 14250681 14250682 14256312 0 0 90158 1 64 2
159 200 14250613 14250683 14250684 14250685 14250686 14256313 0 0 90159 2 64 2
160 125 14250571 14250687 14250688 14250689 14250690 14256314 0 0 90160 1 44 1
161 50 14250572 14250691 14250692 14250693 14250694 14256315 0 0 90161 1 51 1
162 500 14250514 14250699 14250700 14250701 14250702 14256371 0 0 90162 1 50 1
163 50 14250133 14250703 14250704 14250705 14250706 14256318 0 0 90163 1 35 1
164 200 9199 14250707 14250708 14250709 14250710 14256319 0 0 90164 1 54 2
165 500 14250663 14250695 14250696 14250697 14250698 14256316 0 0 90165 1 60 0
166 500 14250196 14250715 14250716 14250717 14250718 14256321 0 0 90166 1 74 1
167 100 14256313 14250719 14250720 14250721 14250722 14256322 0 0 90167 2 51 2
168 300 90160 14250723 14250724 14250725 14250726 14256323 0 0 90168 1 62 1
169 200 14250578 14250727 14250728 14250729 14250730 14256324 0 0 90169 1 48 1
170 100 14250221 14250731 14250732 14250733 14250734 14256325 0 0 90170 1 54 1
171 70 14250565 14250735 14250736 14250737 14250738 14256326 0 0 90171 1 31 1
172 300 14250615 14250739 14250740 14250741 14250742 14256327 0 0 90172 1 35 1
173 75 14250636 14250743 14250744 14250745 14250746 14256328 0 0 90173 1 29 1
174 300 14250356 14250408 14250409 14250410 14250411 14256100 0 0 90174 1 62 1
175 0 14250664 14250711 14250712 14250713 14250714 14256320 0 0 90175 1 52 0
176 500 14250658 14250751 14250752 14250753 14250751 14256330 0 0 90176 1 43 1
177 100 14256218 14250755 14250756 14250757 14250758 14256331 0 0 90177 1 20 1
178 100 14250610 14250759 14250760 14250761 14250762 14256332 0 0 90178 1 39 1
179 60 14250579 14250763 14250764 14250765 14250766 14256333 0 0 90179 2 39 1
180 125 14250169 14250767 14250768 14250769 14250770 14256334 0 0 90180 1 33 1
181 500 14250512 14250771 14250772 14250773 14250774 142565335 0 0 90181 1 44 2
182 0 14250725 14250775 14250776 14250777 14250778 14256336 0 0 90182 1 54 1
184 200 14256324 14250783 14250784 14250785 14250786 14256338 0 0 90184 1 48 1
185 200 14250561 14250787 14250788 14250789 14250790 14256339 0 0 90185 1 59 1
186 500 14250732 14250791 14250792 14250793 14250794 14256142 0 0 90186 1 52 1
187 600 0 14250795 14250796 14250797 14250798 14256163 0 0 90187 2 59 1
188 25 14250770 14250799 14250800 14250801 14250802 14256334 0 0 90188 1 36 1
189 80 14250730 14250807 14250808 14250809 14250810 14256146 0 0 90189 1 28 1
190 150 14250703 14250811 14250812 14250813 14250814 14256147 0 0 90190 1 29 1
191 120 0 14250815 14250816 14250817 14250818 14256147 0 0 90191 1 55 0
192 100 14256074 14250819 14250820 14250821 14250822 14256149 0 0 90192 2 44 1
193 150 14250570 14250823 14250824 14250826 14256150 14250825 0 0 90193 1 45 1
194 400 14250721 14250827 14250828 14250829 14250830 14256151 0 0 90194 0 42 1
195 60 14250683 14250831 14250832 14250833 14250834 14256152 0 0 90195 0 30 1
196 0 14250482 14250835 14250836 14250837 14250838 14256153 0 0 90196 1 50 2
197 500 14250774 14250843 14250844 14250845 14250846 14256155 0 0 90197 1 48 0
198 250 14250799 14250839 14250840 14250841 14250842 14256154 0 0 90198 1 55 1
199 0 14250155 14250960 14250961 14250962 14250963 14256221 0 0 90199 1 36 1
200 150 0 14250851 14250852 14250853 14250854 14256157 0 0 90200 1 28 1
201 25 14250694 14250847 14250848 14250849 14250850 14256156 0 0 90201 1 53 1
202 30 14250558 14250863 14250864 14250865 14250866 14256160 0 0 90202 1 34 1
203 50 14250783 14250867 14250868 14250869 14250870 14256161 0 0 90203 1 27 1
204 0 14250552 14250871 14250872 14250873 14250874 14256162 0 0 90204 2 49 0
205 30 14250810 14250875 14250876 14250877 14250878 14256163 0 0 90205 1 27 1
206 200 90155 14250880 14250881 14250882 14250883 14256164 0 0 90206 1 47 1
207 0 14250724 14250885 14250886 14250887 14250888 14256165 0 0 90207 1 69 1
208 30 14250871 14250888 14250889 14250890 14250891 14256166 0 0 90208 0 40 0
209 70 14250795 14250892 14250893 14250894 14250895 14256167 0 0 90209 1 29 2
210 0 14250729 14250900 14250901 14250902 14250903 14256192 0 0 90210 1 64 1
211 100 14250733 14250904 14250905 14250906 14250907 14256170 0 0 90211 1 63 1
212 100 14250295 14250908 14250909 14250910 14250911 14256171 0 0 90212 1 34 1
213 375 14250773 14250912 14250913 14250914 14250915 14256172 0 0 90213 2 40 1
214 50 14250892 14250916 14250917 14250918 14250919 14256345 0 0 90214 1 30 1
215 100 14250771 14250920 14250921 14250922 14250923 14256344 0 0 90215 1 42 1
216 100 14250900 14250952 14250953 14250954 14250955 14256180 0 0 90216 1 31 1
217 100 14250611 14250924 14250925 14250926 14250927 14256346 0 0 90217 1 62 1
218 50 14250838 14250896 14250897 14250898 14250899 14256174 0 0 90218 1 59 0
219 50 90210 14250859 14250860 14250861 14250862 14256159 0 0 90219 2 48 2
220 100 14250786 14250932 14250933 14250934 14250935 14256175 0 0 90220 1 49 1
221 400 14250923 14250515 14250516 14250517 14250518 14256202 0 0 90221 1 58 1
222 25 14250272 14250968 14250969 14250970 14250971 14256223 0 0 90222 1 55 1
223 200 14250296 14250855 14250856 14250856 14250858 14256158 0 0 90223 2 30 1
224 200 14250785 14250948 14250949 14250950 14250951 142560179 0 0 90224 1 38 1
225 800 14250920 14250957 14250958 14250959 14256220 14250956 0 0 90225 1 55 2
226 30 14250612 14250940 14250941 14250942 14250943 14256177 0 0 90226 1 35 1
227 400 14256211 14250936 14250937 14250938 14250939 14256176 0 0 90227 1 38 1
228 300 14250179 14250944 14250945 14250946 14250947 14256178 0 0 90228 2 51 1
229 100 90187 14250964 14250965 14250967 14256222 14250966 0 0 90229 1 47 2
230 100 14250560 14250972 14250973 14250974 14250975 14256224 0 0 90230 1 21 1
231 150 14250859 14250976 14250977 14250978 14250979 14256225 0 0 90231 1 68 1
232 40 14250948 14250980 14250981 14250982 14250983 14256223 0 0 90232 1 41 1
233 250 14250946 14251000 14251001 14251002 14251003 14256231 0 0 90233 1 44 2
234 300 14250492 14250984 14250985 14250986 14250987 14256227 0 0 90234 1 50 1
235 100 14250090 14250992 14250993 14250994 14250995 14256229 0 0 90235 2 53 1
236 100 14250178 14250996 14250997 14250998 14250999 142596230 0 0 90236 2 44 2
237 200 14250860 14251008 14251009 14251010 14251011 14256182 0 0 90237 1 74 1
238 0 14250555 14251012 14251013 14251014 14251015 14256183 0 0 90238 1 65 0
239 200 14250396 14251016 14251017 14251018 14251019 14256184 0 0 90239 2 29 1
240 75 14256035 14251020 14251021 14251022 14251023 14256185 0 0 90240 2 30 0
241 300 14250984 14251004 14251005 14251006 14251007 14256181 0 0 90241 1 53 1
242 200 14250861 0 0 0 0 0 0 0 90242 1 65 1
243 800 14250491 14251028 14251029 41251030 14251031 14256187 0 0 90243 1 65 1
244 400 14250734 14251036 14251037 14251038 14251039 14256189 0 0 90244 1 62 1
245 150 14250832 14251056 14251057 14251058 14251059 14256194 0 0 90245 0 45 2
246 400 14250731 14251060 14251061 14251062 14251063 14256200 0 0 90246 1 45 1
247 100 14250559 0 0 0 0 0 0 0 90247 2 62 1
248 500 14251056 0 0 0 0 0 0 0 90248 1 28 2
249 450 14250944 14251044 14251045 14251046 14251047 14256191 0 0 90249 1 40 1
250 600 14250413 0 0 0 0 0 0 0 90250 1 35 1
251 220 14250945 14251048 14251049 14251050 14251051 14256192 0 0 90251 2 30 1
252 300 14251046 14251076 14251077 14251078 14251079 14256198 0 0 90252 2 41 1
253 300 14251078 14251080 14251081 14251082 14251083 14256199 0 0 90253 1 33 1
254 70 14251077 0 0 0 0 0 0 0 90254 1 52 1
255 500 14256213 0 0 0 0 0 0 0 90255 2 67 1
256 100 14250414 0 0 0 0 0 0 0 90256 1 48 1
257 400 14256178 0 0 0 0 0 0 0 90257 2 42 1
258 100 14251047 0 0 0 0 0 0 0 90258 2 41 1
259 100 14251045 0 0 0 0 0 0 0 90259 1 39 1
260 50 14256189 0 0 0 0 0 0 0 90260 2 56 1
261 300 14250588 0 0 0 0 0 0 0 90261 1 49 1
262 75 14250831 0 0 0 0 0 0 0 90262 1 36 2
263 500 14250156 0 0 0 0 0 0 0 90263 1 46 1
264 40 14250893 14251040 14251041 14251042 14251043 14256190 0 0 90264 1 27 1
266 400 14250862 0 0 0 0 0 0 0 90266 1 55 1
Data collected in respondent-driven sampling (RDS) studies are often messy, especially when information on coupons given out and received is recorded on paper and only later entered into a computer. A while ago, I wrote some code to process RDS data and check for some common errors. It doesn't catch all of them, by any means, but it will catch quite a few. This code and the dataset are available as a [gist](https://gist.github.com/sdwfrost/1bb5f41d4dac317be22c) from my GitHub site.
# The data handling code
Firstly, here is the processing code. It's fairly ugly and inefficient, as I wasn't as familiar with R as I am now.
```{r}
prepareDataset <- function(data,idColumnName,netsizeColumnName,couponColumnName,seedCoupon,couponList){
result <- matchIds(data,idColumnName,couponColumnName,seedCoupon,couponList)
result$rds.size <- as.double(data[netsizeColumnName][,1])
if(length(result$rds.size[result$rds.size<1])>0){
print("Missing values/Network size of less than one found")
}
result
}
convertToFactor <- function(data,x){
# Convert column x to a factor and return the modified data frame
data[[x]] <- as.factor(data[[x]])
data
}
makeCouponFactor <- function(data,couponColumnName){
couponVector <- data[couponColumnName][,1]
f <- rep(NA,length(couponVector))
for(i in 1:length(couponVector)){
f[i] <- strsplit(as.character(couponVector[i]),split="")[[1]][1]
}
as.factor(f)
}
checkForDups <- function(couponVector,seedCoupon){
tb <- table(couponVector)
dupCoupons <- names(tb)[tb>1]
dupCoupons[dupCoupons!=as.character(seedCoupon)]
}
matchIds <- function(data,idColumnName,couponColumnName,seedCoupon,couponList){
# Extract ID numbers; I convert to characters, as the code will fail if factors are used
toId <- as.character(data[idColumnName][,1])
numIds <- length(toId)
# Extract the coupon numbers received by each participant
couponVector <- data[couponColumnName][,1]
# Find seeds and replace their coupon number with missing
couponRecvdList=list()
for(i in 1:numIds){
if(paste(couponVector[i])==paste(seedCoupon)){
couponRecvdList[paste(toId[i])]=as.character(NA)
}
else{
couponRecvdList[paste(toId[i])]=paste(couponVector[i])
}
}
# Generate a list of all coupon ids, and the participant ids they were given to
couponGivenList=list()
for(i in 1:length(couponList)){
cupList=data[couponList[i]][,1]
for(j in 1:numIds){
cupNum=cupList[j][1]
if(is.na(cupNum)==F){
couponGivenList[[paste(cupNum)]]=toId[j]
}
}
}
# Cross-match coupons for each individual
# Sanity check; make sure that each coupon received was actually given out by someone
# Need to add check that coupon wasn't received twice
badCoupons=c()
matchedIdList=list()
seedList=list()
for(i in 1:length(couponRecvdList)){
tmpCoupon=couponRecvdList[i]
if(!is.na(tmpCoupon)){
matchedCouponNumber=intersect(tmpCoupon,names(couponGivenList))
validCoupon=(matchedCouponNumber==tmpCoupon)
if(length(validCoupon)==0){
badCoupons=c(badCoupons,tmpCoupon)
}
else{
matchedId=couponGivenList[matchedCouponNumber]
matchedIdList[names(tmpCoupon)]=matchedId
seedList[names(tmpCoupon)]=0
}
}
else{
matchedIdList[names(tmpCoupon)]=as.character(NA)
seedList[names(tmpCoupon)]=1
}
}
if(length(badCoupons)>0){
print("Bad coupons")
print(badCoupons)
print("Invalid coupons detected")
return()
}
# Assemble results
result=data
result$rds.toId=as.character(toId)
result$rds.fromId=as.character(matchedIdList)
result$rds.seed=as.integer(seedList)
result
}
```
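This will return a data frame with three additional columns: the subject identifier of the recruit (`rds.toId`), the subject identifier of the recruiter (`rds.fromId`), and a flag indicating whether the subject is a seed or not (`rds.seed`). A fourth column, `rds.size`, is added by `prepareDataset` and holds the network size as a number.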
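The cross-matching step can also be written without loops. The following is only a sketch of the same idea on a made-up four-person dataset, not a replacement for the code above; the `toy` data frame, its coupon numbers, and the assumption that `0` means "no coupon given" are all hypothetical.

```{r}
# Toy data (hypothetical): participant 1 is a seed (mycoupon == 0)
toy <- data.frame(
  ID       = c(1, 2, 3, 4),
  mycoupon = c(0, 101, 102, 201),
  coupon1  = c(101, 201, 0, 0),
  coupon2  = c(102, 202, 0, 0)
)
couponCols <- c("coupon1", "coupon2")

# Long table of (coupon given out, ID of the person who gave it)
given <- data.frame(
  coupon    = unlist(toy[couponCols], use.names = FALSE),
  recruiter = rep(toy$ID, times = length(couponCols))
)
# Assumption: 0 means that no coupon was given out in that slot
given <- given[given$coupon != 0, ]

# Look up each received coupon among the coupons given out
fromId <- given$recruiter[match(toy$mycoupon, given$coupon)]
seed <- as.integer(toy$mycoupon == 0)
data.frame(toId = toy$ID, fromId = fromId, seed = seed)
```

Here `match` returns `NA` for any received coupon that was never given out, so seeds and bad coupons both end up with a missing recruiter; the `seed` flag is what tells them apart.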
# An example: the New York Jazz dataset
The data come from the well-known NYC jazz musician study by [Heckathorn and Jeffri](http://www.respondentdrivensampling.org/reports/Heckathorn.pdf). The file used here is a subset of the full data, in tab-delimited form, generated by hand from the original RDSAT file.
```{r}
nyjazz <- read.table("nyjazz.txt",header=T,row.names=NULL,sep="\t")
head(nyjazz)
```
I then attempt to process the data. The arguments are as follows:
1. The dataset (a data frame)
2. The name of the column with the respondent ID
3. The name of the column with network size
4. The name of the column with the coupon received
5. The numeric code for a seed coupon
6. A vector of column names corresponding to the coupons given out to each participant
```{r}
nyjazz.prep <- prepareDataset(nyjazz,"ID","netsize","mycoupon",0,c("coupon1","coupon2","coupon3","coupon4","coupon5","coupon6","coupon7","coupon8"))
```
Note that this doesn't identify any bad coupons, but it does print a message that some network sizes are missing or less than one.
```{r}
table(nyjazz$netsize,exclude=NULL)
```
As numeric codes are used to represent missing data in RDSAT, one has to be careful. These may represent missing network sizes or real reports of no ties.
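If, after checking, the zeros are judged to be missing-data codes rather than genuine reports of no ties, one option is to recode them as `NA` before analysis. This is a minimal sketch on a made-up vector; treating every value below one as missing is an assumption that has to be justified for each study.

```{r}
# Hypothetical network sizes; 0 is ASSUMED here to be a missing-data code
netsize <- c(350, 0, 585, 400, 0)

# Recode values below one as missing, leaving the rest unchanged
netsize.clean <- ifelse(netsize < 1, NA, netsize)

# Number of values flagged for review
sum(is.na(netsize.clean))
```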
To see what happens when individuals come in with a coupon that is not given out, let's omit one of the columns.
```{r}
prepareDataset(nyjazz,"ID","netsize","mycoupon",0,c("coupon1","coupon2","coupon3","coupon4","coupon5","coupon6","coupon7"))
```
This now prints a list of IDs, with the coupon numbers those participants came in with, which one should then go and check. In this case the function returns `NULL` rather than a processed data frame. The code doesn't check whether two people come in with the same coupon, but this should be apparent when one comes to plot the RDS recruitment tree.
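A check for coupons that were received more than once can be sketched along the lines of the unused `checkForDups` helper above. The vector below is made up for illustration, and `0` is assumed to be the seed code, as in the jazz data.

```{r}
# Hypothetical received-coupon vector; 0 marks a seed
mycoupon <- c(0, 0, 101, 102, 101, 203)
seedCoupon <- 0

# Tabulate the non-seed coupons and keep those seen more than once
tb <- table(mycoupon[mycoupon != seedCoupon])
dups <- names(tb)[tb > 1]
dups

# Rows of the participants who came in with a duplicated coupon
which(as.character(mycoupon) %in% dups)
```

With the real data, the same lines could be applied to `nyjazz$mycoupon` to list the rows that need checking against the paper records.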