Skip to content

Instantly share code, notes, and snippets.

@kgjenkins
Last active June 28, 2020 16:41

Revisions

  1. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -72,7 +72,7 @@ At this point, you could save this table as a CSV file with just the columns you

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename ("puma2010_neighbor_pairs.csv")
    * For "File name", click `...` to specify the output location and filename ([puma2010_neighbor_pairs.csv](#file-puma2010_neighbor_pairs-csv))
    * Uncheck any fields you do not want to include
    * Skip the option options, and click "OK" at the bottom

    @@ -104,6 +104,6 @@ To save the table as a CSV file:

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename ("puma2010_neighbors.csv")
    * For "File name", click `...` to specify the output location and filename ([puma2010_neighbors.csv](#file-puma2010_neighbors-csv))
    * Skip the option options, and click "OK" at the bottom

  2. kgjenkins revised this gist Jun 27, 2020. 2 changed files with 15999 additions and 2 deletions.
    15,997 changes: 15,997 additions & 0 deletions puma2010_neighbor_pairs.csv
    15,997 additions, 0 deletions not shown because the diff is too large. Please use a local Git client to view these changes.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -72,7 +72,7 @@ At this point, you could save this table as a CSV file with just the columns you

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename
    * For "File name", click `...` to specify the output location and filename ("puma2010_neighbor_pairs.csv")
    * Uncheck any fields you do not want to include
    * Skip the option options, and click "OK" at the bottom

    @@ -104,6 +104,6 @@ To save the table as a CSV file:

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename
    * For "File name", click `...` to specify the output location and filename ("puma2010_neighbors.csv")
    * Skip the option options, and click "OK" at the bottom

  3. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -77,9 +77,9 @@ At this point, you could save this table as a CSV file with just the columns you
    * Skip the option options, and click "OK" at the bottom


    ## Group neighbors into a cell
    ## Group neighbors into a single value for each PUMA

    It may be preferable to group together all the neighbors for each county, like this:
    It may be preferable to group together all the neighbors for each PUMA, like this:

    | GEOID10 | neighbors |
    |-|-|
  4. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 10 additions and 9 deletions.
    19 changes: 10 additions & 9 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -7,11 +7,12 @@ Public Use Microdata Areas ([PUMA](https://www.census.gov/programs-surveys/geogr
    The US Census has PUMA boundary shapefiles available by state. Here's the directory of all the 2010 PUMAs, 2019 vintage:
    https://www2.census.gov/geo/tiger/TIGER2019/PUMA/

    Download manually one at a time, or automate using [wget](https://www.gnu.org/software/wget/):
    You can manually download all the files one at a time, or automate the process using [wget](https://www.gnu.org/software/wget/):
    ```
    wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    ```

    About the wget command-line options used above:
    `-A` limits to specific file extensions
    `-r` means recursive

    @@ -20,12 +21,12 @@ wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/

    Before calculating neighboring PUMAs, we'll want to merge the separate state files into a single nationwide data file. Note that QGIS can read the zipfiles without us having to extract them first. (Although extracting is usually recommended if you are not immediately saving to another format.)

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Merge vector layers" and double-click it.
    * For "Input layers", click the `...` button and "Add Directory", selecting the folder where the zipfiles were saved.
    * In QGIS, under the Processing menu, open the Toolbox
    * Search for "Merge vector layers" and double-click it
    * For "Input layers", click the `...` button and "Add Directory", selecting the folder where you saved the zipfiles
    * Once you have see all the state files listed, click "OK"
    * Leave the "Destination CRS" blank (it will use the CRS from the files).
    * Leave "Merged" set to "[Create temporary layer]" so we can check the output before saving.
    * Leave the "Destination CRS" blank (it will use the CRS from the files)
    * Leave "Merged" set to "[Create temporary layer]" so we can check the output before saving
    * Click "Run"

    After a few seconds, you should see the US on the map, and a new temporary layer called "Merged" in the list of layers.
    @@ -43,7 +44,7 @@ If it looks good, right-click "Merged" > "Make Permanent..."
    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. There are several ways to do this in QGIS, but the fastest technique is to use the "Join attributes by location" processing tool.

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Join attributes by location" and double-click it.
    * Search for "Join attributes by location" and double-click it
    * Set "Base Layer" = "puma2010"
    * Set "Join Layer" = "puma2010"
    * "Geometric predicate" should be "intersects"
    @@ -53,9 +54,9 @@ The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to

    Wait until processing is complete (about 50 seconds), and then you should see a new layer called "Joined Layer".

    * Close the processing window.
    * Close the processing window
    * Right-click the "Joined layer" > "Open Attribute Table"
    * Click the "GEOID10" column header to sort by that column.
    * Click the "GEOID10" column header to sort by that column

    You'll notice that the same record is repeated several times, but if you scroll to the right, you'll see a new column called "GEOID10_2", which has the ID of one of the neighboring PUMAs. The basic table structure is like this (omitting most of the other columns):

  5. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion readme.md
    Original file line number Diff line number Diff line change
    @@ -18,7 +18,7 @@ wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/

    ## Merge to single layer

    We can use [QGIS](https://qgis.org/) to merge the separate state files into a single data file. QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)
    Before calculating neighboring PUMAs, we'll want to merge the separate state files into a single nationwide data file. Note that QGIS can read the zipfiles without us having to extract them first. (Although extracting is usually recommended if you are not immediately saving to another format.)

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Merge vector layers" and double-click it.
  6. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 4 additions and 2 deletions.
    6 changes: 4 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -1,11 +1,13 @@
    # Calculate adjacent PUMAs

    Public Use Microdata Areas ([PUMA](https://www.census.gov/programs-surveys/geography/guidance/geo-areas/pumas.html)), are statistical geographic areas defined by the US Census. For certain types of analysis, it is useful to determine which PUMAs are adjacent to each other. The following instructions show how to download the PUMA boundaries and then use [QGIS](https://qgis.org/) to calculate neighboring PUMAs, and export the results as a CSV file. This same technique could be used with other boundary polygons.

    ## Download PUMA boundaries

    The US Census has PUMA boundary shapefiles available by state. Here's the directory of all the 2010 PUMAs, 2019 vintage:
    https://www2.census.gov/geo/tiger/TIGER2019/PUMA/

    Download manually one at a time, or automate using wget:
    Download manually one at a time, or automate using [wget](https://www.gnu.org/software/wget/):
    ```
    wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    ```
    @@ -16,7 +18,7 @@ wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/

    ## Merge to single layer

    Next, we'll use QGIS to merge the separate state files into a single data file. QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)
    We can use [QGIS](https://qgis.org/) to merge the separate state files into a single data file. QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Merge vector layers" and double-click it.
  7. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2381 additions and 1 deletion.
    2,382 changes: 2,381 additions & 1 deletion puma2010_neighbors.csv
    2,381 additions, 1 deletion not shown because the diff is too large. Please use a local Git client to view these changes.
  8. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions puma2010_neighbors.csv
    Original file line number Diff line number Diff line change
    @@ -0,0 +1 @@
    GEOID10,neighbors
  9. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -86,9 +86,9 @@ It may be preferable to group together all the neighbors for each county, like t
    We can restructure the data using a QGIS virtual layer:

    * Layer menu > Create Layer > New Virtual Layer...
    * Paste the following query (you can omit "NAMELSAD10" if you don't need it, or you could add other column names if needed):
    * Paste the following query:
    ```
    select GEOID10, NAMELSAD10, group_concat(geoid10_2) as neighbors
    select GEOID10, group_concat(geoid10_2) as neighbors
    from "Joined layer"
    group by GEOID10
    ```
  10. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -78,10 +78,10 @@ At this point, you could save this table as a CSV file with just the columns you

    It may be preferable to group together all the neighbors for each county, like this:

    | GEOID10 | count | neighbors |
    |-|-|-|
    | 1 | 3 | 2,3,7 |
    | 2 | 2 | 1,4 |
    | GEOID10 | neighbors |
    |-|-|
    | 1 | 2,3,7 |
    | 2 | 1,4 |

    We can restructure the data using a QGIS virtual layer:

  11. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -88,9 +88,9 @@ We can restructure the data using a QGIS virtual layer:
    * Layer menu > Create Layer > New Virtual Layer...
    * Paste the following query (you can omit "NAMELSAD10" if you don't need it, or you could add other column names if needed):
    ```
    select geoid10, namelsad10, count(*) as count, group_concat(geoid10_2) as neighbors
    select GEOID10, NAMELSAD10, group_concat(geoid10_2) as neighbors
    from "Joined layer"
    group by geoid10
    group by GEOID10
    ```
    * Under "Geometry", select "No geometry"
    * Click "Add", then "Close"
  12. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 21 additions and 9 deletions.
    30 changes: 21 additions & 9 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -74,21 +74,33 @@ At this point, you could save this table as a CSV file with just the columns you
    * Skip the option options, and click "OK" at the bottom


    ## Group neighbors into a single value
    ## Group neighbors into a cell

    It may be preferable to group together all the neighbors for each county, like this:

    | GEOID10 | neighbors |
    |-|-|
    | 1 | 2,3,7 |
    | 2 | 1,4 |
    | GEOID10 | count | neighbors |
    |-|-|-|
    | 1 | 3 | 2,3,7 |
    | 2 | 2 | 1,4 |

    We can restructure the data using a QGIS virtual layer:



    * Layer menu > Create Layer > New Virtual Layer...
    * Paste the following query (you can omit "NAMELSAD10" if you don't need it, or you could add other column names if needed):
    ```
    select geoid10, fieldcalc, group_concat(geoid10_2)
    select geoid10, namelsad10, count(*) as count, group_concat(geoid10_2) as neighbors
    from "Joined layer"
    group by geoid10
    ```
    ```
    * Under "Geometry", select "No geometry"
    * Click "Add", then "Close"

    A new table layer "virtual_layer" should appear. Right-click > Open Attribute Table to view it.

    To save the table as a CSV file:

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename
    * Skip the option options, and click "OK" at the bottom

  13. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 21 additions and 1 deletion.
    22 changes: 21 additions & 1 deletion readme.md
    Original file line number Diff line number Diff line change
    @@ -65,8 +65,28 @@ You'll notice that the same record is repeated several times, but if you scroll
    | 2 | ... | 1 |
    | 2 | ... | 4 |

    At this point, you could save this table as a CSV file with just the columns you need:

    * Right-click "Joined layer" > Export > Save Features As
    * Set "Format" = "Comma Separated Value (CSV)"
    * For "File name", click `...` to specify the output location and filename
    * Uncheck any fields you do not want to include
    * Skip the option options, and click "OK" at the bottom


    ## Group neighbors into a single value

    It may be preferable to group together all the neighbors for each county, like this:

    | GEOID10 | neighbors |
    |-|-|
    | 1 | 2,3,7 |
    | 2 | 1,4 |

    We can restructure the data using a QGIS virtual layer:



    Spatial Join (one-to-many), then
    ```
    select geoid10, fieldcalc, group_concat(geoid10_2)
    from "Joined layer"
  14. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 26 additions and 25 deletions.
    51 changes: 26 additions & 25 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -7,12 +7,11 @@ https://www2.census.gov/geo/tiger/TIGER2019/PUMA/

    Download manually one at a time, or automate using wget:
    ```
    wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    wget -A zip -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    ```

    `-A` limits to specific file extensions
    `-r` means recursive
    `-l1` limits recursiveness to one level


    ## Merge to single layer
    @@ -39,31 +38,33 @@ If it looks good, right-click "Merged" > "Make Permanent..."

    ## Calculate adjacent polygons

    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, [Find Neighbor Polygons using Summary Aggregate Function in QGIS](https://spatialthoughts.com/2019/05/23/neighbor-polygons-aggregate-qgis/). The technique works best with two separate layers, so first we'll duplicate the layer, and use the duplicate layer within the aggregate function.

    * Right-click "PUMA" > Duplicate Layer (it will automatically be named "PUMA copy")
    * Right-click "PUMA" > Open Attribute Table
    * In the table toolbar, click the "Field Calculator" button (towards the right, looks like an abacus)
    * Set "Output field name" = "adjacent"
    * Set "Output field type" = "Text (string)"
    * Set the expression (large white area on left) to the following:
    ```
    aggregate(
    layer:= 'PUMA copy',
    aggregate:= 'concatenate',
    expression:= GEOID10,
    concatenator:= ',',
    filter:= GEOID10<>attribute(@parent, 'GEOID10') and intersects($geometry, geometry(@parent))
    )
    ```

    Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.

    Click "OK" and wait for the calculation to finish. It may take around 20 minutes for the whole US.
    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. There are several ways to do this in QGIS, but the fastest technique is to use the "Join attributes by location" processing tool.

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Join attributes by location" and double-click it.
    * Set "Base Layer" = "puma2010"
    * Set "Join Layer" = "puma2010"
    * "Geometric predicate" should be "intersects"
    * For "Fields to add", click the `...` and select "GEOID10"
    * "Join type" should be "Create separate feature for each matching feature (one-to-many)"
    * Leave the other settings as is, and click "Run"

    Wait until processing is complete (about 50 seconds), and then you should see a new layer called "Joined Layer".

    * Close the processing window.
    * Right-click the "Joined layer" > "Open Attribute Table"
    * Click the "GEOID10" column header to sort by that column.

    You'll notice that the same record is repeated several times, but if you scroll to the right, you'll see a new column called "GEOID10_2", which has the ID of one of the neighboring PUMAs. The basic table structure is like this (omitting most of the other columns):

    | GEOID10 | ... | GEOID10_2 |
    |-|-|-|
    | 1 | ... | 2 |
    | 1 | ... | 3 |
    | 1 | ... | 7 |
    | 2 | ... | 1 |
    | 2 | ... | 4 |

    FASTER TECHNIQUE:

    Spatial Join (one-to-many), then
    ```
  15. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 8 additions and 0 deletions.
    8 changes: 8 additions & 0 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -63,3 +63,11 @@ Notice the preview below your expression. It should look something like '020030
    Click "OK" and wait for the calculation to finish. It may take around 20 minutes for the whole US.


    FASTER TECHNIQUE:

    Spatial Join (one-to-many), then
    ```
    select geoid10, fieldcalc, group_concat(geoid10_2)
    from "Joined layer"
    group by geoid10
    ```
  16. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion readme.md
    Original file line number Diff line number Diff line change
    @@ -60,6 +60,6 @@ aggregate(

    Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.

    Click "OK" and wait for the calculation to finish. It may take around 20 minutes.
    Click "OK" and wait for the calculation to finish. It may take around 20 minutes for the whole US.


  17. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 7 additions and 7 deletions.
    14 changes: 7 additions & 7 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -49,14 +49,14 @@ The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to
    * Set the expression (large white area on left) to the following:
    ```
    aggregate(
    layer:= 'PUMA copy',
    aggregate:='concatenate',
    expression:=GEOID10,
    concatenator:=',',
    filter:=touches($geometry, geometry(@parent))
    )
    layer:= 'PUMA copy',
    aggregate:= 'concatenate',
    expression:= GEOID10,
    concatenator:= ',',
    filter:= GEOID10<>attribute(@parent, 'GEOID10') and intersects($geometry, geometry(@parent))
    )
    ```
    Note that we use the function `touches()` instead of `intersects` (which would include the PUMA itself). Census boundary data is pretty good, with adjacent polygons sharing the same exact boundary, so `touches` should work well.

    Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.

  18. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 6 additions and 5 deletions.
    11 changes: 6 additions & 5 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -17,7 +17,7 @@ wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/

    ## Merge to single layer

    QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)
    Next, we'll use QGIS to merge the separate state files into a single data file. QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Merge vector layers" and double-click it.
    @@ -39,16 +39,17 @@ If it looks good, right-click "Merged" > "Make Permanent..."

    ## Calculate adjacent polygons

    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, ["Find Neighbor Polygons using Summary Aggregate Function in QGIS"](https://spatialthoughts.com/2019/05/23/neighbor-polygons-aggregate-qgis/).
    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, [Find Neighbor Polygons using Summary Aggregate Function in QGIS](https://spatialthoughts.com/2019/05/23/neighbor-polygons-aggregate-qgis/). The technique works best with two separate layers, so first we'll duplicate the layer, and use the duplicate layer within the aggregate function.

    * Right-click "PUMA" > "Open Attribute Table"
    * Right-click "PUMA" > Duplicate Layer (it will automatically be named "PUMA copy")
    * Right-click "PUMA" > Open Attribute Table
    * In the table toolbar, click the "Field Calculator" button (towards the right, looks like an abacus)
    * Set "Output field name" = "adjacent"
    * Set "Output field type" = "Text (string)"
    * Set the expression (large white area on left) to the following:
    ```
    aggregate(
    layer:= 'PUMA',
    layer:= 'PUMA copy',
    aggregate:='concatenate',
    expression:=GEOID10,
    concatenator:=',',
    @@ -59,6 +60,6 @@ Note that we use the function `touches()` instead of `intersects` (which would i

    Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.

    Click "OK" and wait for the calculation to finish. It may take several minutes...
    Click "OK" and wait for the calculation to finish. It may take around 20 minutes.


  19. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -10,8 +10,8 @@ Download manually one at a time, or automate using wget:
    wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    ```

    `-A` limits to specific file extensions
    `-r` means recursive
    `-A` limits to specific file extensions
    `-r` means recursive
    `-l1` limits recursiveness to one level


  20. kgjenkins revised this gist Jun 27, 2020. 1 changed file with 3 additions and 6 deletions.
    9 changes: 3 additions & 6 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -1,7 +1,6 @@
    # Calculate adjacent PUMAs

    Download PUMA boundaries
    ===============
    ## Download PUMA boundaries

    The US Census has PUMA boundary shapefiles available by state. Here's the directory of all the 2010 PUMAs, 2019 vintage:
    https://www2.census.gov/geo/tiger/TIGER2019/PUMA/
    @@ -16,8 +15,7 @@ wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    `-l1` limits recursiveness to one level


    Merge to single layer
    =====================
    ## Merge to single layer

    QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)

    @@ -39,8 +37,7 @@ If it looks good, right-click "Merged" > "Make Permanent..."
    * Right-click "Merged" > "Rename layer" and rename it "PUMA"


    Calculate adjacent polygons
    ===========================
    ## Calculate adjacent polygons

    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, ["Find Neighbor Polygons using Summary Aggregate Function in QGIS"](https://spatialthoughts.com/2019/05/23/neighbor-polygons-aggregate-qgis/).

  21. kgjenkins created this gist Jun 27, 2020.
    67 changes: 67 additions & 0 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,67 @@
    # Calculate adjacent PUMAs

    Download PUMA boundaries
    ===============

    The US Census has PUMA boundary shapefiles available by state. Here's the directory of all the 2010 PUMAs, 2019 vintage:
    https://www2.census.gov/geo/tiger/TIGER2019/PUMA/

    Download manually one at a time, or automate using wget:
    ```
    wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
    ```

    `-A` limits to specific file extensions
    `-r` means recursive
    `-l1` limits recursiveness to one level


    Merge to single layer
    =====================

    QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)

    * In QGIS, under the Processing menu, open the Toolbox.
    * Search for "Merge vector layers" and double-click it.
    * For "Input layers", click the `...` button and "Add Directory", selecting the folder where the zipfiles were saved.
    * Once you have see all the state files listed, click "OK"
    * Leave the "Destination CRS" blank (it will use the CRS from the files).
    * Leave "Merged" set to "[Create temporary layer]" so we can check the output before saving.
    * Click "Run"

    After a few seconds, you should see the US on the map, and a new temporary layer called "Merged" in the list of layers.

    If it looks good, right-click "Merged" > "Make Permanent..."

    * GeoPackage (the default format) is a fine format -- just one file!
    * For "File name", click the `...` to specify the output location and filename (something like 'puma2010.gpkg')
    * Ignore all the other options, and click "OK"
    * Right-click "Merged" > "Rename layer" and rename it "PUMA"


    Calculate adjacent polygons
    ===========================

    The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, ["Find Neighbor Polygons using Summary Aggregate Function in QGIS"](https://spatialthoughts.com/2019/05/23/neighbor-polygons-aggregate-qgis/).

    * Right-click "PUMA" > "Open Attribute Table"
    * In the table toolbar, click the "Field Calculator" button (towards the right, looks like an abacus)
    * Set "Output field name" = "adjacent"
    * Set "Output field type" = "Text (string)"
    * Set the expression (large white area on left) to the following:
    ```
    aggregate(
    layer:= 'PUMA',
    aggregate:='concatenate',
    expression:=GEOID10,
    concatenator:=',',
    filter:=touches($geometry, geometry(@parent))
    )
    ```
    Note that we use the function `touches()` instead of `intersects` (which would include the PUMA itself). Census boundary data is pretty good, with adjacent polygons sharing the same exact boundary, so `touches` should work well.

    Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.

    Click "OK" and wait for the calculation to finish. It may take several minutes...