Steampipe queries persist in cache for a default 5 minutes. You can increase the TTL to keep results in cache longer, but sometimes you'll want to save query results for later analysis.
Here are some common scenarios.
You can use steampipe check
with the --export
flag to capture a CSV representation of the data that powers an individual control, or a suite of controls packaged as a benchmark.
Suppose you want to capture the data for an individual control, for example Control: 2.1.2 Ensure S3 Bucket Policy is set to deny HTTP requests. Here's the command to run that control, from a clone of AWS Compliance in ~/steampipe-mod-aws-compliance
.
steampipe check control.cis_v150_2_1_2
How can you find the name control.cis_v150_2_1_2
in order to run this command? If you run the complete CIS v1.5.0 benchmark in a dashboard, you can find its human-readable label: 2.1.2 Ensure S3 Bucket Policy is set to deny HTTP requests
. Copy the label as shown here.
Then paste it into the search box at hub.steampipe.io to find the landing page for the control. There, along with helpful instructions for remediating an alarm, you'll find a usage note that gives you the needed steampipe check
command.
Given the base command, you can use --output csv
and redirect to a file.
steampipe check control.cis_v150_2_1_2 --output csv > ~/csv/cis_1_5_s3_bucket_enforces_ssl.csv
Now, using the CSV plugin, and assuming its configured paths include ~/csv
, you can query that data.
select * from csv.s3_bucket_enforces_ssl where status = 'alarm'
To capture the data for a complete benchmark:
steampipe check benchmark.cis_v150_2_1 --output csv > ~/csv/cis_1_5_storage.csv
To query that data:
select * from csv.cis_1_5_storage where status = 'alarm'
To capture the entire suite of benchmarks:
steampipe check all --output csv > ~/csv/cis_1_5_all.csv
To query that data:
select * from csv.cis_1_5_all where status = 'alarm'
Alternatively you can run the benchmark with steampipe dashboard
, use the download button to capture CSV output, move it to your CSV directory, and run the same query.
Controls and benchmarks often use named queries.
To run a named query using steampipe query
, you'll again need to find the name. When you're operating in a mod workspace, e.g. ~/steampipe-mod-aws-insights
, you can list named queries like so.
steampipe query list
Let's look for the query behind control.cis_v150_2_1_2
select resource_name, sql from steampipe_query where resource_name ~ 's3_bucket'
Scanning the results, we find:
s3_bucket_enforces_ssl
This is an alternate, console-based way to find the named query reported on the above hub page.
To export the data:
steampipe query query.s3_bucket_enforces_ssl --export csv > ~/csv/s3_bucket_enforces_ssl.csv
The named query s3_bucket_enforces_ssl
doesn't require a parameter, but many do. For example s3_bucket_public_access is parameterized by arn. If the bucket in question is arn:aws:s3:::my-bucket-27
then you pass the arn as a parameter to the query like so:
steampipe query 'query.s3_bucket_public_access("arn:aws:s3:::my-bucket-27")`
You can then use --output csv
and redirect to a file as above.
There are several ways to do it.
steampipe query "select now(), instance_id, instance_state, account_id, region from aws_ec2_instance" --output csv > ~/csv/ec2_instance_states_2023_04_27.csv
You can then query like so:
select * from csv.ec2_instance_states_2023_04_27 where instance_state = 'running'
steampipe query
> create table ec2_instance_states_2023_04_27 as
select
now(),
instance_id,
instance_state,
account_id,
region
from
aws_ec2_instance
Now you can query like so:
> select
*
from
ec2_instance_states_2023_04_27
where
instance_state = 'running'
> create materialized view ec2_instance_states as
select
now(),
instance_id,
instance_state,
account_id,
region
from
aws_ec2_instance
What's the difference? Use create table
to save results into a table that won't change. In this example, if you're saving tables like ec2_instance_states_2023_04_27
on a daily basis, you'll never need to drop and recreate that per-day table with fresh data.
If you just want a single view of instance states that you'll update on a daily basis to record the current state of affairs, it makes sense to create a materialized view. That way, instead of dropping and recreating the table every day, you can create it once and then schedule the query steampipe query "refresh materialized view ec2_instance_states"
to update the view daily.