Here's part of a query to list, combine, and contextualize IP addresses from various parts of AWS.
ec2 as (
-- Check ec2 so we can provide better details
SELECT v.ip as ip,
'ec2' as source,
title as title,
instance_id as id,
region as region,
account_id
FROM aws_ec2_instance
CROSS JOIN LATERAL (
VALUES (public_ip_address),
(private_ip_address)
) as v(ip)
right join target on target.ip = v.ip
),
eni as (
-- ENI covers all active ip addresses
SELECT v.ip as ip,
interface_type as source,
description as title,
network_interface_id as id,
region as region,
account_id
FROM aws_ec2_network_interface
CROSS JOIN LATERAL (
VALUES (private_ip_address),
(association_public_ip),
(association_customer_owned_ip)
) as v(ip)
right join target on target.ip = v.ip
where attached_instance_id is null -- ignore ec2 instances
),
The values
statement in each CTE can provide arbitrarily many flavors of IP address from its base table.
Let's load some sample data and explore how the cross join lateral works.
CREATE TABLE aws_ec2_instance (
title TEXT,
instance_id TEXT,
region TEXT,
account_id TEXT,
public_ip_address TEXT,
private_ip_address TEXT
);
INSERT INTO aws_ec2_instance VALUES
('Instance 1', 'i-1234567890abcdef0', 'us-west-2', '123456789012', '54.194.252.215', '172.31.16.139'),
('Instance 2', 'i-0abcdef1234567890', 'us-east-1', '210987654321', '34.192.2.143', '172.31.17.210');
CREATE TABLE target (
ip TEXT
);
INSERT INTO target VALUES
('54.194.252.215'),
('172.31.17.210');
Per ChatGPT:
So, the CROSS JOIN LATERAL (VALUES (public_ip_address), (private_ip_address)) as v(ip) part of your query is creating a new table v(ip) that contains two rows for each row in the aws_ec2_instance table, and then cross-joining this new table with the aws_ec2_instance table. This effectively doubles the number of rows in the aws_ec2_instance table, with each EC2 instance represented twice: once with its public IP address and once with its private IP address.
Focusing on just the ec2
CTE, we can look at the new table before the final right join to see that.
with data as (
select *
from aws_ec2_instance
cross join lateral (
values (public_ip_address),
(private_ip_address)
) as v(ip)
)
select * from data;
title | instance_id | region | account_id | public_ip_address | private_ip_address | ip
------------+---------------------+-----------+--------------+-------------------+--------------------+----------------
Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012 | 54.194.252.215 | 172.31.16.139 | 54.194.252.215
Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012 | 54.194.252.215 | 172.31.16.139 | 172.31.16.139
Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321 | 34.192.2.143 | 172.31.17.210 | 34.192.2.143
Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321 | 34.192.2.143 | 172.31.17.210 | 172.31.17.210
This the final result for the ec2
CTE.
select v.ip as ip,
'ec2' as source,
title as title,
instance_id as id,
region as region,
account_id
from aws_ec2_instance
cross join lateral (
values (public_ip_address),
(private_ip_address)
) as v(ip)
right join target on target.ip = v.ip;
ip | source | title | id | region | account_id
----------------+--------+------------+---------------------+-----------+--------------
172.31.17.210 | ec2 | Instance 2 | i-0abcdef1234567890 | us-east-1 | 210987654321
54.194.252.215 | ec2 | Instance 1 | i-1234567890abcdef0 | us-west-2 | 123456789012
The CTEs use a common schema, so you can union
them together.
select * from ec2
union
select * from eni
I showed this explanation to the author of the query, and here was his response.
It successfully describes the what but not necessarily the why.
There are several approaches that all solve the problem, but the reason the cross join lateral is used here is to perform the expansion while only making a single query to the AWS table
Other methods would require iterating through the table once for each different column that you wanted to expose as IP.