Skip to content

Instantly share code, notes, and snippets.

@judell
Last active July 16, 2023 23:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save judell/c8d168f3384d2bed9ccfd74755c572d7 to your computer and use it in GitHub Desktop.
Save judell/c8d168f3384d2bed9ccfd74755c572d7 to your computer and use it in GitHub Desktop.
cross join lateral example

Here's part of a query to list, combine, and contextualize IP addresses from various parts of AWS.

ec2 as (
    -- Check ec2 so we can provide better details
    SELECT v.ip as ip,
        'ec2' as source,
        title as title,
        instance_id as id,
        region as region,
        account_id
    FROM aws_ec2_instance
        CROSS JOIN LATERAL (
            VALUES (public_ip_address),
                (private_ip_address)
        ) as v(ip)
        right join target on target.ip = v.ip
),
eni as (
    -- ENI covers all active ip addresses 
    SELECT v.ip as ip,
        interface_type as source,
        description as title,
        network_interface_id as id,
        region as region,
        account_id
    FROM aws_ec2_network_interface
        CROSS JOIN LATERAL (
            VALUES (private_ip_address),
                (association_public_ip),
                (association_customer_owned_ip)
        ) as v(ip)
        right join target on target.ip = v.ip
    where attached_instance_id is null -- ignore ec2 instances
),

The values statement in each CTE can provide arbitrarily many flavors of IP address from its base table.

How it works

Let's load some sample data and explore how the cross join lateral works.

CREATE TABLE aws_ec2_instance (
    title TEXT,
    instance_id TEXT,
    region TEXT,
    account_id TEXT,
    public_ip_address TEXT,
    private_ip_address TEXT
);

INSERT INTO aws_ec2_instance VALUES
('Instance 1', 'i-1234567890abcdef0', 'us-west-2', '123456789012', '54.194.252.215', '172.31.16.139'),
('Instance 2', 'i-0abcdef1234567890', 'us-east-1', '210987654321', '34.192.2.143', '172.31.17.210');
CREATE TABLE target (
    ip TEXT
);

INSERT INTO target VALUES
('54.194.252.215'),
('172.31.17.210');

Per ChatGPT:

So, the CROSS JOIN LATERAL (VALUES (public_ip_address), (private_ip_address)) as v(ip) part of your query is creating a new table v(ip) that contains two rows for each row in the aws_ec2_instance table, and then cross-joining this new table with the aws_ec2_instance table. This effectively doubles the number of rows in the aws_ec2_instance table, with each EC2 instance represented twice: once with its public IP address and once with its private IP address.

Focusing on just the ec2 CTE, we can look at the new table before the final right join to see that.

with data as (
select *
from aws_ec2_instance
    cross join lateral (
        values (public_ip_address),
            (private_ip_address)
    ) as v(ip)
)
select * from data;
   title    |     instance_id     |  region   |  account_id  | public_ip_address | private_ip_address |       ip
------------+---------------------+-----------+--------------+-------------------+--------------------+----------------
 Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012 | 54.194.252.215    | 172.31.16.139      | 54.194.252.215
 Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012 | 54.194.252.215    | 172.31.16.139      | 172.31.16.139
 Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321 | 34.192.2.143      | 172.31.17.210      | 34.192.2.143
 Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321 | 34.192.2.143      | 172.31.17.210      | 172.31.17.210

This the final result for the ec2 CTE.

select v.ip as ip,
  'ec2' as source,
  title as title,
  instance_id as id,
  region as region,
  account_id
from aws_ec2_instance
  cross join lateral (
     values (public_ip_address),
            (private_ip_address)
    ) as v(ip)
right join target on target.ip = v.ip;
       ip       | source |   title    |         id          |  region   |  account_id
----------------+--------+------------+---------------------+-----------+--------------
 172.31.17.210  | ec2    | Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321
 54.194.252.215 | ec2    | Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012

The CTEs use a common schema, so you can union them together.

select * from ec2
union
select * from eni

Why do it this way?

I showed this explanation to the author of the query, and here was his response.

It successfully describes the what but not necessarily the why.

There are several approaches that all solve the problem, but the reason the cross join lateral is used here is to perform the expansion while only making a single query to the AWS table

Other methods would require iterating through the table once for each different column that you wanted to expose as IP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment