@arpit15
Created June 30, 2022 15:16
ubuntu@ip-10-0-0-21:/var/log$ cat chef-client.log
# Logfile created on 2022-06-30 15:08:39 +0000 by logger.rb/v1.4.3
[2022-06-30T15:08:41+00:00] INFO: Started Cinc Zero at chefzero://localhost:1 with repository at /etc/chef (One version per cookbook)
Starting Cinc Client, version 17.2.29
Patents: https://www.chef.io/patents
[2022-06-30T15:08:42+00:00] INFO: *** Cinc Client 17.2.29 ***
[2022-06-30T15:08:42+00:00] INFO: Platform: x86_64-linux
[2022-06-30T15:08:42+00:00] INFO: Cinc-client pid: 1116
[2022-06-30T15:08:46+00:00] WARN: Run List override has been provided.
[2022-06-30T15:08:46+00:00] WARN: Original Run List: []
[2022-06-30T15:08:46+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::init]]
[2022-06-30T15:08:46+00:00] INFO: Run List is [recipe[aws-parallelcluster::init]]
[2022-06-30T15:08:46+00:00] INFO: Run List expands to [aws-parallelcluster::init]
[2022-06-30T15:08:46+00:00] INFO: Starting Cinc Client Run for ip-10-0-0-21.us-east-2.compute.internal
[2022-06-30T15:08:46+00:00] INFO: Running start handlers
[2022-06-30T15:08:46+00:00] INFO: Start handlers complete.
resolving cookbooks for run list: ["aws-parallelcluster::init"]
[2022-06-30T15:08:51+00:00] INFO: Loading cookbooks [aws-parallelcluster@3.1.4, apt@7.4.0, iptables@8.0.0, line@4.0.1, nfs@2.6.4, openssh@2.9.1, pyenv@3.4.2, selinux@3.1.1, yum@6.1.1, yum-epel@4.1.2, aws-parallelcluster-install@3.1.4, aws-parallelcluster-config@3.1.4, aws-parallelcluster-slurm@3.1.4, aws-parallelcluster-scheduler-plugin@3.1.4, aws-parallelcluster-awsbatch@3.1.4, aws-parallelcluster-test@3.1.4]
[2022-06-30T15:08:51+00:00] INFO: Skipping removal of obsoleted cookbooks from the cache
Synchronizing Cookbooks:
- aws-parallelcluster (3.1.4)
- apt (7.4.0)
- iptables (8.0.0)
- line (4.0.1)
- nfs (2.6.4)
- openssh (2.9.1)
- pyenv (3.4.2)
- selinux (3.1.1)
- yum (6.1.1)
- yum-epel (4.1.2)
- aws-parallelcluster-install (3.1.4)
- aws-parallelcluster-config (3.1.4)
- aws-parallelcluster-slurm (3.1.4)
- aws-parallelcluster-scheduler-plugin (3.1.4)
- aws-parallelcluster-awsbatch (3.1.4)
- aws-parallelcluster-test (3.1.4)
Installing Cookbook Gems:
Compiling Cookbooks...
[2022-06-30T15:08:54+00:00] INFO: Detected bootstrap file aws-parallelcluster-cookbook-3.1.4
[2022-06-30T15:08:54+00:00] INFO: Appending search domain 'compute1.pcluster.' to /etc/systemd/resolved.conf
[2022-06-30T15:08:54+00:00] INFO: Restarting 'systemd-resolved' service, platform ubuntu '20.04'
Converging 24 resources
Recipe: aws-parallelcluster-config::init
* template[/etc/parallelcluster/cfnconfig] action create[2022-06-30T15:08:55+00:00] INFO: Processing template[/etc/parallelcluster/cfnconfig] action create (aws-parallelcluster-config::init line 39)
[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] created file /etc/parallelcluster/cfnconfig
- create new file /etc/parallelcluster/cfnconfig[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] updated file contents /etc/parallelcluster/cfnconfig
- update content in file /etc/parallelcluster/cfnconfig from none to 0cf156
--- /etc/parallelcluster/cfnconfig 2022-06-30 15:08:55.602445872 +0000
+++ /etc/parallelcluster/.chef-cfnconfig20220630-1116-jotb8o 2022-06-30 15:08:55.602445872 +0000
@@ -1 +1,16 @@
+stack_name=compute1
+cfn_preinstall=NONE
+cfn_preinstall_args=(NONE)
+cfn_postinstall=NONE
+cfn_postinstall_args=(NONE)
+cfn_region=us-east-2
+cfn_scheduler=slurm
+cfn_scheduler_slots=vcpus
+cfn_instance_slots=1
+cfn_ephemeral_dir=/scratch
+cfn_ebs_shared_dirs=/data
+cfn_proxy=NONE
+cfn_node_type=HeadNode
+cfn_cluster_user=ubuntu
+cfn_volume=vol-0b087667ecac188e4
[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] mode changed to 644
- change mode from '' to '0644'
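
The rendered cfnconfig is a plain key=value file meant to be sourced as shell variables; the fetch_and_run script below does exactly that. A minimal sketch of reading it by hand:

    # Source the rendered config and inspect a few of its variables
    . /etc/parallelcluster/cfnconfig
    echo "node_type=${cfn_node_type} region=${cfn_region} scheduler=${cfn_scheduler}"
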
* link[/opt/parallelcluster/cfnconfig] action create[2022-06-30T15:08:55+00:00] INFO: Processing link[/opt/parallelcluster/cfnconfig] action create (aws-parallelcluster-config::init line 44)
[2022-06-30T15:08:55+00:00] INFO: link[/opt/parallelcluster/cfnconfig] created
- create symlink at /opt/parallelcluster/cfnconfig to /etc/parallelcluster/cfnconfig
* template[/opt/parallelcluster/scripts/fetch_and_run] action create[2022-06-30T15:08:55+00:00] INFO: Processing template[/opt/parallelcluster/scripts/fetch_and_run] action create (aws-parallelcluster-config::init line 48)
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] created file /opt/parallelcluster/scripts/fetch_and_run
- create new file /opt/parallelcluster/scripts/fetch_and_run[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] updated file contents /opt/parallelcluster/scripts/fetch_and_run
- update content in file /opt/parallelcluster/scripts/fetch_and_run from none to 1eb47f
--- /opt/parallelcluster/scripts/fetch_and_run 2022-06-30 15:08:55.666447442 +0000
+++ /opt/parallelcluster/scripts/.chef-fetch_and_run20220630-1116-zss57f 2022-06-30 15:08:55.666447442 +0000
@@ -1 +1,73 @@
+#!/bin/bash
+
+cfnconfig_file="/etc/parallelcluster/cfnconfig"
+. ${cfnconfig_file}
+
+# Check expected variables from cfnconfig file
+function check_params () {
+ if [ -z "${cfn_region}" ] || [ -z "${cfn_preinstall}" ] || [ -z "${cfn_preinstall_args}" ] || [ -z "${cfn_postinstall}" ] || [ -z "${cfn_postinstall_args}" ]; then
+ error_exit "One or more required variables from ${cfnconfig_file} file are undefined"
+ fi
+}
+
+# Error exit function
+function error_exit () {
+ script=`basename $0`
+ echo "parallelcluster: ${script} - $1"
+ logger -t parallelcluster "${script} - $1"
+ exit 1
+}
+
+function download_run (){
+ url=$1
+ shift
+ scheme=$(echo "${url}"| cut -d: -f1)
+ tmpfile=$(mktemp)
+ trap "/bin/rm -f $tmpfile" RETURN
+ if [ "${scheme}" == "s3" ]; then
+ /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws --region ${cfn_region} s3 cp ${url} - > $tmpfile || return 1
+ else
+ wget -qO- ${url} > $tmpfile || return 1
+ fi
+ chmod +x $tmpfile || return 1
+ $tmpfile "$@" || error_exit "Failed to run ${ACTION}, ${file} failed with non 0 return code: $?"
+}
+
+function run_preinstall () {
+ if [ "${cfn_preinstall}" != "NONE" ]; then
+ file="${cfn_preinstall}"
+ if [ "${cfn_preinstall_args}" != "NONE" ]; then
+ download_run ${cfn_preinstall} "${cfn_preinstall_args[@]}"
+ else
+ download_run ${cfn_preinstall}
+ fi
+ fi || error_exit "Failed to run preinstall"
+}
+
+function run_postinstall () {
+ RC=0
+ if [ "${cfn_postinstall}" != "NONE" ]; then
+ file="${cfn_postinstall}"
+ if [ "${cfn_postinstall_args}" != "NONE" ]; then
+ download_run ${cfn_postinstall} "${cfn_postinstall_args[@]}"
+ else
+ download_run ${cfn_postinstall}
+ fi
+ fi || error_exit "Failed to run postinstall"
+}
+
+check_params
+
+ACTION=${1#?}
+case ${ACTION} in
+ preinstall)
+ run_preinstall
+ ;;
+ postinstall)
+ run_postinstall
+ ;;
+ *)
+ echo "Unknown action. Exit gracefully"
+ exit 0
+esac
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] owner changed to 0
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] group changed to 0
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] mode changed to 755
- change mode from '' to '0755'
- change owner from '' to 'root'
- change group from '' to 'root'
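
Note the dispatch at the bottom of fetch_and_run: ACTION=${1#?} strips the first character of the first argument, so the action name arrives with a one-character prefix. A hedged sketch of a manual invocation (the leading dash is an assumption based on that stripping; on this cluster both actions are no-ops since cfn_preinstall and cfn_postinstall are NONE):

    bash /opt/parallelcluster/scripts/fetch_and_run -preinstall
    bash /opt/parallelcluster/scripts/fetch_and_run -postinstall
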
* fetch_config[Fetch and load cluster configs] action run[2022-06-30T15:08:55+00:00] INFO: Processing fetch_config[Fetch and load cluster configs] action run (aws-parallelcluster-config::init line 57)
* execute[copy_cluster_config_from_s3] action run[2022-06-30T15:08:55+00:00] INFO: Processing execute[copy_cluster_config_from_s3] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/resources/fetch_config.rb line 39)
[execute] {
"AcceptRanges": "bytes",
"LastModified": "Thu, 30 Jun 2022 15:03:57 GMT",
"ContentLength": 2393,
"ETag": "\"f46e93f2e1d20766b21d01c749cd9024\"",
"VersionId": "ejW58vArA8P12goCpkK88TZUjFZNFL3z",
"ContentType": "binary/octet-stream",
"ServerSideEncryption": "AES256",
"Metadata": {}
}
[2022-06-30T15:08:58+00:00] INFO: execute[copy_cluster_config_from_s3] ran successfully
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws s3api get-object --bucket parallelcluster-2e5bf78a0b005f30-v1-do-not-delete --key parallelcluster/3.1.4/clusters/compute1-9k2cwodproy9ug0m/configs/cluster-config-with-implied-values.yaml --region us-east-2 /opt/parallelcluster/shared/cluster-config.yaml --version-id ejW58vArA8P12goCpkK88TZUjFZNFL3z
* execute[copy_instance_type_data_from_s3] action run[2022-06-30T15:08:58+00:00] INFO: Processing execute[copy_instance_type_data_from_s3] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/resources/fetch_config.rb line 53)
[execute] {
"AcceptRanges": "bytes",
"LastModified": "Thu, 30 Jun 2022 15:04:04 GMT",
"ContentLength": 2894,
"ETag": "\"fa59c29b20615569e2fb5182bb7fc4d9\"",
"VersionId": "ZCV8h_S0EgcsxtoIbRgX5frTWNP6Ai1x",
"ContentType": "binary/octet-stream",
"ServerSideEncryption": "AES256",
"Metadata": {}
}
[2022-06-30T15:08:59+00:00] INFO: execute[copy_instance_type_data_from_s3] ran successfully
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws s3api get-object --bucket parallelcluster-2e5bf78a0b005f30-v1-do-not-delete --key parallelcluster/3.1.4/clusters/compute1-9k2cwodproy9ug0m/configs/instance-types-data.json --region us-east-2 /opt/parallelcluster/shared/instance-types-data.json
* ruby_block[load cluster configuration] action run[2022-06-30T15:08:59+00:00] INFO: Processing ruby_block[load cluster configuration] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster/libraries/helpers.rb line 537)
[2022-06-30T15:08:59+00:00] INFO: ruby_block[load cluster configuration] called
- execute the ruby block load cluster configuration
Recipe: aws-parallelcluster-config::cloudwatch_agent
* cookbook_file[write_cloudwatch_agent_json.py] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[write_cloudwatch_agent_json.py] action create (aws-parallelcluster-config::cloudwatch_agent line 19)
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] created file /usr/local/bin/write_cloudwatch_agent_json.py
- create new file /usr/local/bin/write_cloudwatch_agent_json.py[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] updated file contents /usr/local/bin/write_cloudwatch_agent_json.py
- update content in file /usr/local/bin/write_cloudwatch_agent_json.py from none to 056bb9
--- /usr/local/bin/write_cloudwatch_agent_json.py 2022-06-30 15:08:59.290536486 +0000
+++ /usr/local/bin/.chef-write_cloudwatch_agent_json20220630-1116-4ghn0n.py 2022-06-30 15:08:59.286536388 +0000
@@ -1 +1,217 @@
+#!/usr/bin/env python
+"""
+Write the CloudWatch agent configuration file.
+
+Write the JSON used to configure the CloudWatch agent on an instance conditional
+on the scheduler to be used, the platform (OS family) in use and the instance's role in the cluster.
+"""
+
+import argparse
+import json
+import os
+import socket
+
+import yaml
+
+AWS_CLOUDWATCH_CFG_PATH = "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json"
+
+
+def parse_args():
+ """Parse CL args and return an argparse.Namespace."""
+ parser = argparse.ArgumentParser(description="Create the cloudwatch agent config file")
+ parser.add_argument("--config", help="Path to JSON file describing logs that should be monitored", required=True)
+ parser.add_argument(
+ "--platform", help="OS family of this instance", choices=["amazon", "centos", "ubuntu"], required=True
+ )
+ parser.add_argument("--log-group", help="Name of the log group", required=True)
+ parser.add_argument(
+ "--node-role",
+ required=True,
+ choices=["HeadNode", "ComputeFleet"],
+ help="Role this node plays in the cluster " "(i.e., is it a compute node or the head node?)",
+ )
+ parser.add_argument("--scheduler", required=True, choices=["slurm", "awsbatch", "plugin"], help="Scheduler")
+ parser.add_argument(
+ "--cluster-config-path",
+ required=False,
+ help="Cluster configuration path",
+ )
+ return parser.parse_args()
+
+
+def gethostname():
+ """Return hostname of this instance."""
+ return socket.gethostname().split(".")[0]
+
+
+def write_config(config):
+ """Write config to AWS_CLOUDWATCH_CFG_PATH."""
+ with open(AWS_CLOUDWATCH_CFG_PATH, "w+") as output_config_file:
+ json.dump(config, output_config_file, indent=4)
+
+
+def add_log_group_name_params(log_group_name, configs):
+ """Add a "log_group_name": log_group_name to every config."""
+ for config in configs:
+ config.update({"log_group_name": log_group_name})
+ return configs
+
+
+def add_instance_log_stream_prefixes(configs):
+ """Prefix all log_stream_name fields with instance identifiers."""
+ for config in configs:
+ config["log_stream_name"] = "{host}.{{instance_id}}.{log_stream_name}".format(
+ host=gethostname(), log_stream_name=config["log_stream_name"]
+ )
+ return configs
+
+
+def read_data(config_path):
+ """Read in log configuration data from config_path."""
+ with open(config_path) as infile:
+ return json.load(infile)
+
+
+def select_configs_for_scheduler(configs, scheduler):
+ """Filter out from configs those entries whose 'schedulers' list does not contain scheduler."""
+ return [config for config in configs if scheduler in config["schedulers"]]
+
+
+def select_configs_for_node_role(configs, node_role):
+ """Filter out from configs those entries whose 'node_roles' list does not contain node_role."""
+ return [config for config in configs if node_role in config["node_roles"]]
+
+
+def select_configs_for_platform(configs, platform):
+ """Filter out from configs those entries whose 'platforms' list does not contain platform."""
+ return [config for config in configs if platform in config["platforms"]]
+
+
+def get_node_info():
+ """Return the information encoded in the JSON file at /etc/chef/dna.json."""
+ node_info = {}
+ dna_path = "/etc/chef/dna.json"
+ if os.path.isfile(dna_path):
+ with open(dna_path) as node_info_file:
+ node_info = json.load(node_info_file).get("cluster")
+ return node_info
+
+
+def select_configs_for_feature(configs):
+ """Filter out from configs those entries whose 'feature_conditions' list contains an unsatisfied entry."""
+ selected_configs = []
+ node_info = get_node_info()
+ for config in configs:
+ conditions = config.get("feature_conditions", [])
+ for condition in conditions:
+ dna_keys = condition.get("dna_key")
+ if isinstance(dna_keys, str): # dna_key can be a string for single level dict or a list for nested dicts
+ dna_keys = [dna_keys]
+ value = node_info
+ for key in dna_keys:
+ value = value.get(key)
+ if value is None:
+ break
+ if value not in condition.get("satisfying_values"):
+ break
+ else:
+ selected_configs.append(config)
+ return selected_configs
+
+
+def select_logs(configs, args):
+ """Select the appropriate set of log configs."""
+ selected_configs = select_configs_for_scheduler(configs, args.scheduler)
+ selected_configs = select_configs_for_node_role(selected_configs, args.node_role)
+ selected_configs = select_configs_for_platform(selected_configs, args.platform)
+ selected_configs = select_configs_for_feature(selected_configs)
+ return selected_configs
+
+
+def get_node_roles(scheduler_plugin_node_roles):
+ """Map a scheduler plugin NodeType value to the corresponding cluster node roles."""
+ node_type_roles_map = {"ALL": ["ComputeFleet", "HeadNode"], "HEAD": ["HeadNode"], "COMPUTE": ["ComputeFleet"]}
+ return node_type_roles_map.get(scheduler_plugin_node_roles)
+
+
+def load_config(cluster_config_path):
+ with open(cluster_config_path) as input_file:
+ return yaml.load(input_file, Loader=yaml.SafeLoader)
+
+
+def add_scheduler_plugin_log(config_data, cluster_config_path):
+ """Add custom log files to config data if log files specified in scheduler plugin."""
+ cluster_config = load_config(cluster_config_path)
+ if (
+ get_dict_value(cluster_config, "Scheduling.SchedulerSettings.SchedulerDefinition.Monitoring.Logs.Files")
+ and get_dict_value(cluster_config, "Scheduling.Scheduler") == "plugin"
+ ):
+ log_files = get_dict_value(
+ cluster_config, "Scheduling.SchedulerSettings.SchedulerDefinition.Monitoring.Logs.Files"
+ )
+ for log_file in log_files:
+ # Add log config
+ log_config = {
+ "timestamp_format_key": log_file.get("LogStreamName"),
+ "file_path": log_file.get("FilePath"),
+ "log_stream_name": log_file.get("LogStreamName"),
+ "schedulers": ["plugin"],
+ "platforms": ["centos", "ubuntu", "amazon"],
+ "node_roles": get_node_roles(log_file.get("NodeType")),
+ "feature_conditions": [],
+ }
+ config_data["log_configs"].append(log_config)
+
+ # Add timestamp formats
+ config_data["timestamp_formats"][log_file.get("LogStreamName")] = log_file.get("TimestampFormat")
+ return config_data
+
+
+def add_timestamps(configs, timestamps_dict):
+ """For each config, set its timestamp_format field based on its timestamp_format_key field."""
+ for config in configs:
+ config["timestamp_format"] = timestamps_dict[config["timestamp_format_key"]]
+ return configs
+
+
+def filter_output_fields(configs):
+ """Remove fields that are not required by CloudWatch agent config file."""
+ desired_keys = ["log_stream_name", "file_path", "timestamp_format", "log_group_name"]
+ return [{desired_key: config[desired_key] for desired_key in desired_keys} for config in configs]
+
+
+def create_config(log_configs):
+ """Return a dict representing the structure of the output JSON."""
+ return {
+ "logs": {
+ "logs_collected": {"files": {"collect_list": log_configs}},
+ "log_stream_name": "{host}.{{instance_id}}.default-log-stream".format(host=gethostname()),
+ }
+ }
+
+
+def get_dict_value(value, attributes, default=None):
+ """Get key value from dictionary and return default if the key does not exist."""
+ for key in attributes.split("."):
+ value = value.get(key, None)
+ if value is None:
+ return default
+ return value
+
+
+def main():
+ """Create cloudwatch agent config file."""
+ args = parse_args()
+ config_data = read_data(args.config)
+ if args.cluster_config_path:
+ config_data = add_scheduler_plugin_log(config_data, args.cluster_config_path)
+ log_configs = select_logs(config_data["log_configs"], args)
+ log_configs = add_timestamps(log_configs, config_data["timestamp_formats"])
+ log_configs = add_log_group_name_params(args.log_group, log_configs)
+ log_configs = add_instance_log_stream_prefixes(log_configs)
+ log_configs = filter_output_fields(log_configs)
+ write_config(create_config(log_configs))
+
+
+if __name__ == "__main__":
+ main()
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] owner changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] group changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] mode changed to 755
- change mode from '' to '0755'
- change owner from '' to 'root'
- change group from '' to 'root'
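
The recipe later drives this generator through environment variables (see the cloudwatch-config-creation execute further down). A hedged sketch of an equivalent manual run; the log group name here is hypothetical:

    /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python \
      /usr/local/bin/write_cloudwatch_agent_json.py \
      --platform ubuntu \
      --config /usr/local/etc/cloudwatch_log_files.json \
      --log-group /aws/parallelcluster/compute1 \
      --scheduler slurm \
      --node-role HeadNode

The result is written to /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json (AWS_CLOUDWATCH_CFG_PATH above).
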
* cookbook_file[cloudwatch_log_files.json] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_files.json] action create (aws-parallelcluster-config::cloudwatch_agent line 29)
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] created file /usr/local/etc/cloudwatch_log_files.json
- create new file /usr/local/etc/cloudwatch_log_files.json[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] updated file contents /usr/local/etc/cloudwatch_log_files.json
- update content in file /usr/local/etc/cloudwatch_log_files.json from none to 146300
--- /usr/local/etc/cloudwatch_log_files.json 2022-06-30 15:08:59.338537669 +0000
+++ /usr/local/etc/.chef-cloudwatch_log_files20220630-1116-q8skll.json 2022-06-30 15:08:59.338537669 +0000
@@ -1 +1,550 @@
+{
+ "timestamp_formats": {
+ "month_first": "%b %-d %H:%M:%S",
+ "default": "%Y-%m-%d %H:%M:%S,%f",
+ "bracket_default": "[%Y-%m-%d %H:%M:%S]",
+ "slurm": "%Y-%m-%dT%H:%M:%S.%f"
+ },
+ "log_configs": [
+ {
+ "timestamp_format_key": "month_first",
+ "file_path": "/var/log/messages",
+ "log_stream_name": "system-messages",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "month_first",
+ "file_path": "/var/log/syslog",
+ "log_stream_name": "syslog",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/cfn-init.log",
+ "log_stream_name": "cfn-init",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/chef-client.log",
+ "log_stream_name": "chef-client",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/cloud-init.log",
+ "log_stream_name": "cloud-init",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/cloud-init-output.log",
+ "log_stream_name": "cloud-init-output",
+ "schedulers": [
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/supervisord.log",
+ "log_stream_name": "supervisord",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/clustermgtd",
+ "log_stream_name": "clustermgtd",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/computemgtd",
+ "log_stream_name": "computemgtd",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/slurm_resume.log",
+ "log_stream_name": "slurm_resume",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/slurm_suspend.log",
+ "log_stream_name": "slurm_suspend",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "slurm",
+ "file_path": "/var/log/slurmd.log",
+ "log_stream_name": "slurmd",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "ComputeFleet"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "slurm",
+ "file_path": "/var/log/slurmctld.log",
+ "log_stream_name": "slurmctld",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "amazon",
+ "centos",
+ "ubuntu"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/pcluster_dcv_authenticator.log",
+ "log_stream_name": "dcv-authenticator",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/sssd/sssd.log",
+ "log_stream_name": "sssd",
+ "schedulers": [
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": [
+ "directory_service",
+ "enabled"
+ ],
+ "satisfying_values": ["true"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/sssd/sssd_default.log",
+ "log_stream_name": "sssd_domain_default",
+ "schedulers": [
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": [
+ "directory_service",
+ "enabled"
+ ],
+ "satisfying_values": ["true"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/pam_ssh_key_generator.log",
+ "log_stream_name": "pam_ssh_key_generator",
+ "schedulers": [
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": [
+ "directory_service",
+ "generate_ssh_keys_for_users"
+ ],
+ "satisfying_values": ["true"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "bracket_default",
+ "file_path": "/var/log/parallelcluster/pcluster_dcv_connect.log",
+ "log_stream_name": "dcv-ext-authenticator",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/dcv/server.log",
+ "log_stream_name": "dcv-server",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/dcv/sessionlauncher.log",
+ "log_stream_name": "dcv-session-launcher",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/dcv/agent.*.log",
+ "log_stream_name": "dcv-agent",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/dcv/dcv-xsession.*.log",
+ "log_stream_name": "dcv-xsession",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/dcv/Xdcv.*.log",
+ "log_stream_name": "Xdcv",
+ "schedulers": [
+ "awsbatch",
+ "slurm",
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "HeadNode"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "dcv_enabled",
+ "satisfying_values": ["head_node"]
+ }
+ ]
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/scheduler-plugin.out.log",
+ "log_stream_name": "scheduler-plugin-out",
+ "schedulers": [
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/scheduler-plugin.err.log",
+ "log_stream_name": "scheduler-plugin-err",
+ "schedulers": [
+ "plugin"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "ComputeFleet",
+ "HeadNode"
+ ],
+ "feature_conditions": []
+ },
+ {
+ "timestamp_format_key": "default",
+ "file_path": "/var/log/parallelcluster/slurm_prolog_epilog.log",
+ "log_stream_name": "slurm_prolog_epilog",
+ "schedulers": [
+ "slurm"
+ ],
+ "platforms": [
+ "centos",
+ "ubuntu",
+ "amazon"
+ ],
+ "node_roles": [
+ "ComputeFleet"
+ ],
+ "feature_conditions": [
+ {
+ "dna_key": "use_private_hostname",
+ "satisfying_values": ["true"]
+ }
+ ]
+ }
+ ]
+}
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] owner changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] group changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] mode changed to 644
- change mode from '' to '0644'
- change owner from '' to 'root'
- change group from '' to 'root'
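
Each log_configs entry above is keyed by scheduler, platform, and node role, and write_cloudwatch_agent_json.py keeps only entries matching all three. A sketch of previewing which files this slurm head node on ubuntu would ship (assumes jq is available, which the AMI may not guarantee; feature_conditions are ignored here):

    jq -r '.log_configs[]
      | select((.schedulers | index("slurm")) and (.platforms | index("ubuntu")) and (.node_roles | index("HeadNode")))
      | .file_path' /usr/local/etc/cloudwatch_log_files.json
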
* cookbook_file[cloudwatch_log_files_schema.json] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_files_schema.json] action create (aws-parallelcluster-config::cloudwatch_agent line 39)
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] created file /usr/local/etc/cloudwatch_log_files_schema.json
- create new file /usr/local/etc/cloudwatch_log_files_schema.json[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] updated file contents /usr/local/etc/cloudwatch_log_files_schema.json
- update content in file /usr/local/etc/cloudwatch_log_files_schema.json from none to 0a6242
--- /usr/local/etc/cloudwatch_log_files_schema.json 2022-06-30 15:08:59.406539343 +0000
+++ /usr/local/etc/.chef-cloudwatch_log_files_schema20220630-1116-hhjy0i.json 2022-06-30 15:08:59.406539343 +0000
@@ -1 +1,51 @@
+{
+ "type": "object",
+ "properties": {
+ "timestamp_formats": {"type": "object"},
+ "log_configs": {
+ "type": "array",
+ "items": {
+ "type": "object",
+ "properties": {
+ "timestamp_format_key": {"type": "string"},
+ "file_path": {"type": "string"},
+ "log_stream_name": {"type": "string"},
+ "schedulers": {
+ "type": "array",
+ "items": {"type": "string", "enum": ["awsbatch", "slurm", "plugin"]}
+ },
+ "platforms": {
+ "type": "array",
+ "items": {"type": "string", "enum": ["amazon", "centos", "ubuntu"]}
+ },
+ "node_roles": {
+ "type": "array",
+ "items": {"type": "string", "enum": ["HeadNode", "ComputeFleet"]}
+ },
+ "feature_conditions": {
+ "type": "array",
+ "items": {
+ "type": "object",
+ "properties": {
+ "dna_key": {"type": ["string", "array"]},
+ "satisfying_values": {"type": "array", "items": {"type": "string"}}
+ },
+ "required": ["dna_key", "satisfying_values"]
+ }
+ }
+ },
+ "required": [
+ "node_roles",
+ "platforms",
+ "schedulers",
+ "log_stream_name",
+ "file_path",
+ "timestamp_format_key",
+ "feature_conditions"
+ ]
+ }
+ }
+ },
+ "required": ["timestamp_formats", "log_configs"]
+}
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] owner changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] group changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] mode changed to 644
- change mode from '' to '0644'
- change owner from '' to 'root'
- change group from '' to 'root'
* cookbook_file[cloudwatch_log_configs_util.py] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_configs_util.py] action create (aws-parallelcluster-config::cloudwatch_agent line 49)
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] created file /usr/local/bin/cloudwatch_log_configs_util.py
- create new file /usr/local/bin/cloudwatch_log_configs_util.py[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] updated file contents /usr/local/bin/cloudwatch_log_configs_util.py
- update content in file /usr/local/bin/cloudwatch_log_configs_util.py from none to bc8f92
--- /usr/local/bin/cloudwatch_log_configs_util.py 2022-06-30 15:08:59.454540525 +0000
+++ /usr/local/bin/.chef-cloudwatch_log_configs_util20220630-1116-h3x11d.py 2022-06-30 15:08:59.454540525 +0000
@@ -1 +1,189 @@
+"""
+Validate and modify the data in the cloudwatch_log_files.json cookbook file.
+
+This file is used to validate and add data to the JSON file that's used to
+configure the CloudWatch agent on a cluster's EC2 instances. The structure of
+the new and/or existing data is validated in the following ways:
+* jsonschema is used to ensure that the input and output configs both possess
+ a valid structure. See cloudwatch_log_files_schema.json for the schema.
+* For each log_configs entry, it's verified that its timestamp_format_key is a
+ valid key into the same config file's timestamp_formats object.
+* It's verified that all log_configs entries have unique values for their
+ log_stream_name and file_path attributes.
+"""
+
+import argparse
+import collections
+import json
+import os
+import shutil
+import sys
+
+import jsonschema
+
+DEFAULT_SCHEMA_PATH = os.path.realpath(os.path.join(os.path.curdir, "cloudwatch_log_files_schema.json"))
+SCHEMA_PATH = os.environ.get("CW_LOGS_CONFIGS_SCHEMA_PATH", DEFAULT_SCHEMA_PATH)
+DEFAULT_LOG_CONFIGS_PATH = os.path.realpath(os.path.join(os.path.curdir, "cloudwatch_log_files.json"))
+LOG_CONFIGS_PATH = os.environ.get("CW_LOGS_CONFIGS_PATH", DEFAULT_LOG_CONFIGS_PATH)
+LOG_CONFIGS_BAK_PATH = "{}.bak".format(LOG_CONFIGS_PATH)
+
+
+def _fail(message):
+ """Exit nonzero with the given error message."""
+ sys.exit(message)
+
+
+def parse_args():
+ """Parse command line args."""
+ parser = argparse.ArgumentParser(
+ description="Validate of add new CloudWatch log configs.",
+ epilog="If neither --input-json nor --input-file are used, this script will validate the existing config.",
+ )
+ add_group = parser.add_mutually_exclusive_group()
+ add_group.add_argument(
+ "--input-file", type=argparse.FileType("r"), help="Path to file containing configs for log files to add."
+ )
+ add_group.add_argument("--input-json", type=json.loads, help="String containing configs for log files to add.")
+ return parser.parse_args()
+
+
+def get_input_json(args):
+ """Either load the input JSON data from a file, or returned the JSON parsed on the CLI."""
+ if args.input_file:
+ with args.input_file:
+ return json.load(args.input_file)
+ else:
+ return args.input_json
+
+
+def _read_json_at(path):
+ """Read the JSON file at path."""
+ try:
+ with open(path) as input_file:
+ return json.load(input_file)
+ except FileNotFoundError:
+ _fail("No file exists at {}".format(path))
+ except ValueError:
+ _fail("File at {} contains invalid JSON".format(path))
+
+
+def _read_schema():
+ """Read the schema for the CloudWatch log configs file."""
+ return _read_json_at(SCHEMA_PATH)
+
+
+def _read_log_configs():
+ """Read the current version of the CloudWatch log configs file, cloudwatch_log_files.json."""
+ return _read_json_at(LOG_CONFIGS_PATH)
+
+
+def _validate_json_schema(input_json):
+ """Ensure the structure of input_json matches the schema."""
+ schema = _read_schema()
+ try:
+ jsonschema.validate(input_json, schema)
+ except jsonschema.exceptions.ValidationError as validation_err:
+ _fail(str(validation_err))
+
+
+def _validate_timestamp_keys(input_json):
+ """Ensure the timestamp_format_key values in input_json's log_configs entries are valid."""
+ valid_keys = set()
+ for config in (input_json, _read_log_configs()):
+ valid_keys |= set(config.get("timestamp_formats").keys())
+ for log_config in input_json.get("log_configs"):
+ if log_config.get("timestamp_format_key") not in valid_keys:
+ _fail(
+ "Log config with log_stream_name {log_stream_name} and file_path {file_path} contains an invalid "
+ "timestamp_format_key: {timestamp_format_key}. Valid values are {valid_keys}".format(
+ log_stream_name=log_config.get("log_stream_name"),
+ file_path=log_config.get("file_path"),
+ timestamp_format_key=log_config.get("timestamp_format_key"),
+ valid_keys=", ".join(valid_keys),
+ )
+ )
+
+
+def _get_duplicate_values(seq):
+ """Get the duplicate values in seq."""
+ counter = collections.Counter(seq)
+ return [value for value, count in counter.items() if count > 1]
+
+
+def _validate_log_config_fields_uniqueness(input_json):
+ """Ensure that each entry in input_json's log_configs list has a unique log_stream_name and file_path."""
+ unique_fields = ("log_stream_name", "file_path")
+ for field in unique_fields:
+ duplicates = _get_duplicate_values([config.get(field) for config in input_json.get("log_configs")])
+ if duplicates:
+ _fail(
+ "The following {field} values are used multiple times: {duplicates}".format(
+ field=field, duplicates=", ".join(duplicates)
+ )
+ )
+
+
+def validate_json(input_json=None):
+ """Ensure the structure of input_json matches that of the file it will be added to."""
+ if input_json is None:
+ input_json = _read_log_configs()
+ _validate_json_schema(input_json)
+ _validate_timestamp_keys(input_json)
+ _validate_log_config_fields_uniqueness(input_json)
+
+
+def _write_log_configs(log_configs):
+ """Write log_configs back to the CloudWatch log configs file."""
+ log_configs_path = os.environ.get("CW_LOGS_CONFIGS_PATH", DEFAULT_LOG_CONFIGS_PATH)
+ with open(log_configs_path, "w") as log_configs_file:
+ json.dump(log_configs, log_configs_file, indent=2)
+
+
+def write_validated_json(input_json):
+ """Write validated JSON back to the CloudWatch log configs file."""
+ log_configs = _read_log_configs()
+ log_configs["log_configs"].extend(input_json.get("log_configs"))
+
+ # NOTICE: the input JSON's timestamp_formats dict is the one that is
+ # updated, so that those defined in the original config aren't clobbered.
+ log_configs["timestamp_formats"] = input_json["timestamp_formats"].update(log_configs.get("timestamp_formats"))
+ _write_log_configs(log_configs)
+
+
+def create_backup():
+ """Create a backup of the file at LOG_CONFIGS_PATH."""
+ shutil.copyfile(LOG_CONFIGS_PATH, LOG_CONFIGS_BAK_PATH)
+
+
+def restore_backup():
+ """Replace the file at LOG_CONFIGS_PATH with the backup that was created in create_backup."""
+ shutil.move(LOG_CONFIGS_BAK_PATH, LOG_CONFIGS_PATH)
+
+
+def remove_backup():
+ """Remove the backup created by create_backup."""
+ try:
+ os.remove(LOG_CONFIGS_BAK_PATH)
+ except FileNotFoundError:
+ pass
+
+
+def main():
+ """Run the script."""
+ args = parse_args()
+ create_backup()
+ try:
+ if args.input_file or args.input_json:
+ input_json = get_input_json(args)
+ validate_json(input_json)
+ write_validated_json(input_json)
+ validate_json()
+ except Exception:
+ restore_backup()
+ finally:
+ remove_backup()
+
+
+if __name__ == "__main__":
+ main()
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] owner changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] group changed to 0
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] mode changed to 644
- change mode from '' to '0644'
- change owner from '' to 'root'
- change group from '' to 'root'
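
The utility reads its paths from two environment variables, falling back to the current directory. A hedged sketch: validate the shipped config in place, then append a hypothetical custom log config (the my-app names are made up):

    export CW_LOGS_CONFIGS_SCHEMA_PATH=/usr/local/etc/cloudwatch_log_files_schema.json
    export CW_LOGS_CONFIGS_PATH=/usr/local/etc/cloudwatch_log_files.json
    PY=/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python
    $PY /usr/local/bin/cloudwatch_log_configs_util.py    # validate only
    $PY /usr/local/bin/cloudwatch_log_configs_util.py --input-json '{
      "timestamp_formats": {},
      "log_configs": [{
        "timestamp_format_key": "default",
        "file_path": "/var/log/my_app.log",
        "log_stream_name": "my-app",
        "schedulers": ["slurm"],
        "platforms": ["ubuntu"],
        "node_roles": ["HeadNode"],
        "feature_conditions": []
      }]
    }'

The recipe performs the validate-only form next.
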
* execute[cloudwatch-config-validation] action run[2022-06-30T15:08:59+00:00] INFO: Processing execute[cloudwatch-config-validation] action run (aws-parallelcluster-config::cloudwatch_agent line 58)
[2022-06-30T15:09:00+00:00] INFO: execute[cloudwatch-config-validation] ran successfully
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/bin/cloudwatch_log_configs_util.py
* execute[cloudwatch-config-creation] action run[2022-06-30T15:09:00+00:00] INFO: Processing execute[cloudwatch-config-creation] action run (aws-parallelcluster-config::cloudwatch_agent line 67)
[2022-06-30T15:09:00+00:00] INFO: execute[cloudwatch-config-creation] ran successfully
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/bin/write_cloudwatch_agent_json.py --platform ubuntu --config $CONFIG_DATA_PATH --log-group $LOG_GROUP_NAME --scheduler $SCHEDULER --node-role $NODE_ROLE
* execute[cloudwatch-agent-start] action run[2022-06-30T15:09:00+00:00] INFO: Processing execute[cloudwatch-agent-start] action run (aws-parallelcluster-config::cloudwatch_agent line 84)
[execute] ****** processing amazon-cloudwatch-agent ******
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
2022/06/30 15:09:01 D! [EC2] Found active network interface
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp
Start configuration validation...
/opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
2022/06/30 15:09:03 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp ...
2022/06/30 15:09:03 I! Valid Json input schema.
I! Detecting run_as_user...
2022/06/30 15:09:03 D! [EC2] Found active network interface
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
amazon-cloudwatch-agent has already been stopped
Created symlink /etc/systemd/system/multi-user.target.wants/amazon-cloudwatch-agent.service → /etc/systemd/system/amazon-cloudwatch-agent.service.
[2022-06-30T15:09:11+00:00] INFO: execute[cloudwatch-agent-start] ran successfully
- execute /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s
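
A quick way to confirm the agent picked up the generated config (status is a standard amazon-cloudwatch-agent-ctl action):

    /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a status -m ec2
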
Recipe: aws-parallelcluster-config::network_interfaces
* log[macs: ["02:4c:91:fd:c4:f4"]] action write[2022-06-30T15:09:11+00:00] INFO: Processing log[macs: ["02:4c:91:fd:c4:f4"]] action write (aws-parallelcluster-config::network_interfaces line 63)
[2022-06-30T15:09:11+00:00] INFO: macs: ["02:4c:91:fd:c4:f4"]
Recipe: aws-parallelcluster-slurm::init
* directory[/etc/parallelcluster/slurm_plugin] action create[2022-06-30T15:09:11+00:00] INFO: Processing directory[/etc/parallelcluster/slurm_plugin] action create (aws-parallelcluster-slurm::init line 20)
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] created directory /etc/parallelcluster/slurm_plugin
- create new directory /etc/parallelcluster/slurm_plugin[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] owner changed to 0
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] group changed to 0
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] mode changed to 755
- change mode from '' to '0755'
- change owner from '' to 'root'
- change group from '' to 'root'
Recipe: aws-parallelcluster-slurm::init_dns
* replace_or_add[append Route53 search domain in /etc/systemd/resolved.conf] action edit[2022-06-30T15:09:11+00:00] INFO: Processing replace_or_add[append Route53 search domain in /etc/systemd/resolved.conf] action edit (aws-parallelcluster-slurm::init_dns line 31)
* file[/etc/systemd/resolved.conf] action create[2022-06-30T15:09:11+00:00] INFO: Processing file[/etc/systemd/resolved.conf] action create (/etc/chef/local-mode-cache/cache/cookbooks/line/resources/replace_or_add.rb line 41)
[2022-06-30T15:09:11+00:00] INFO: file[/etc/systemd/resolved.conf] updated file contents /etc/systemd/resolved.conf
- update content in file /etc/systemd/resolved.conf from e12793 to 530e52
- suppressed sensitive resource
* service[systemd-resolved] action restart[2022-06-30T15:09:11+00:00] INFO: Processing service[systemd-resolved] action restart (aws-parallelcluster-slurm::init_dns line 259)
[2022-06-30T15:09:12+00:00] INFO: service[systemd-resolved] restarted
- restart service service[systemd-resolved]
* hostname[set short hostname] action set[2022-06-30T15:09:12+00:00] INFO: Processing hostname[set short hostname] action set (aws-parallelcluster-slurm::init_dns line 85)
* ohai[reload hostname] action nothing[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload hostname] action nothing (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 139)
(skipped due to action :nothing)
* execute[set hostname to ip-10-0-0-21] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[set hostname to ip-10-0-0-21] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 145)
(skipped due to not_if)
* file[/etc/hosts] action create[2022-06-30T15:09:12+00:00] INFO: Processing file[/etc/hosts] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 106)
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] backed up to /etc/chef/local-mode-cache/backup/etc/hosts.chef-20220630150912.089438
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] updated file contents /etc/hosts
- update content in file /etc/hosts from aa4ea9 to 9fc279
--- /etc/hosts 2022-05-08 22:10:36.000000000 +0000
+++ /etc/.chef-hosts20220630-1116-co0crc 2022-06-30 15:09:12.086851581 +0000
@@ -7,4 +7,5 @@
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
+10.0.0.21 ip-10-0-0-21 ip-10-0-0-21
* execute[hostnamectl set-hostname ip-10-0-0-21] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[hostnamectl set-hostname ip-10-0-0-21] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 188)
(skipped due to not_if)
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] sending reload action to ohai[reload hostname] (delayed)
* ohai[reload hostname] action reload[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload hostname] action reload (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 139)
[2022-06-30T15:09:12+00:00] INFO: ohai[reload hostname] reloaded
- re-run ohai and merge results into node attributes
* ohai[reload_hostname] action nothing[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload_hostname] action nothing (aws-parallelcluster-slurm::init_dns line 91)
(skipped due to action :nothing)
* replace_or_add[set fqdn in the /etc/hosts] action edit[2022-06-30T15:09:12+00:00] INFO: Processing replace_or_add[set fqdn in the /etc/hosts] action edit (aws-parallelcluster-slurm::init_dns line 97)
* file[/etc/hosts] action create[2022-06-30T15:09:12+00:00] INFO: Processing file[/etc/hosts] action create (/etc/chef/local-mode-cache/cache/cookbooks/line/resources/replace_or_add.rb line 41)
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] updated file contents /etc/hosts
- update content in file /etc/hosts from 9fc279 to 5f5a85
- suppressed sensitive resource
Recipe: aws-parallelcluster-config::imds
* directory[/opt/parallelcluster/scripts/imds] action create[2022-06-30T15:09:12+00:00] INFO: Processing directory[/opt/parallelcluster/scripts/imds] action create (aws-parallelcluster-config::imds line 23)
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] created directory /opt/parallelcluster/scripts/imds
- create new directory /opt/parallelcluster/scripts/imds[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] owner changed to 0
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] group changed to 0
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] mode changed to 744
- change mode from '' to '0744'
- change owner from '' to 'root'
- change group from '' to 'root'
* cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] action create[2022-06-30T15:09:12+00:00] INFO: Processing cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] action create (aws-parallelcluster-config::imds line 32)
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] created file /opt/parallelcluster/scripts/imds/imds-access.sh
- create new file /opt/parallelcluster/scripts/imds/imds-access.sh[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] updated file contents /opt/parallelcluster/scripts/imds/imds-access.sh
- update content in file /opt/parallelcluster/scripts/imds/imds-access.sh from none to 690d14
--- /opt/parallelcluster/scripts/imds/imds-access.sh 2022-06-30 15:09:12.526862416 +0000
+++ /opt/parallelcluster/scripts/imds/.chef-imds-access20220630-1116-yh20ek.sh 2022-06-30 15:09:12.526862416 +0000
@@ -1 +1,163 @@
+#!/bin/bash
+set -e
+#
+# Manage the access to IMDS
+#
+# --allow <user1,...,userN> List of users to allow access to IMDS
+# --deny <user1,...,userN> List of users to deny access to IMDS
+# --unset <user1,...,userN> Remove iptables rules related to IMDS for the given list of users
+# --flush Restore default IMDS access
+# --help Print this help message
+
+function error() {
+ >&2 echo "[ERROR] $1"
+ exit 1
+}
+
+function info() {
+ echo "[INFO] $1"
+}
+
+function help() {
+ local -- cmd=$(basename "$0")
+ cat <<EOF
+
+ Usage: ${cmd} [OPTION]...
+
+ Manage the access to IMDS
+
+ --allow <user1,...,userN> Allow IMDS access to the given list of users
+ --deny <user1,...,userN> Deny IMDS access to the given list of users
+ --unset <user1,...,userN> Remove iptables rules related to IMDS for the given list of users
+ --flush Restore default IMDS access
+ --help Print this help message
+EOF
+}
+
+function iptables_delete() {
+ local chain=$1
+ local destination=$2
+ local jump=$3
+ local user=$4
+
+ # Build iptables delete command
+ if [[ -z $user ]]; then
+ rule_args="$chain --destination $destination -j $jump"
+ else
+ rule_args="$chain --destination $destination -j $jump -m owner --uid-owner $user"
+ fi
+
+ local iptables_delete_command="iptables -D $rule_args"
+
+ # Remove rules
+ local should_remove=true
+ while $should_remove; do
+ eval $iptables_delete_command 1>/dev/null 2>/dev/null || should_remove=false
+ done
+}
+
+function iptables_add() {
+ local chain=$1
+ local destination=$2
+ local jump=$3
+ local user=$4
+
+ # Remove duplicate rules
+ iptables_delete $chain $destination $jump $user
+
+ # Remove opposite rules
+ if [[ $jump == "ACCEPT" ]]; then
+ iptables_delete $chain $destination "REJECT" $user
+ elif [[ $jump == "REJECT" ]]; then
+ iptables_delete $chain $destination "ACCEPT" $user
+ fi
+
+ # Build iptables add command
+ if [[ -z $user ]]; then
+ rule_args="$chain --destination $destination -j $jump"
+ else
+ rule_args="$chain --destination $destination -j $jump -m owner --uid-owner $user"
+ fi
+
+ local iptables_add_command="iptables -A $rule_args"
+
+ # Add rule
+ eval $iptables_add_command
+ info "Rule in chain $chain: $destination $jump $user"
+}
+
+function setup_chain() {
+ local chain=$1
+ local source_chain=$2
+ local destination=$3
+
+ iptables --new $chain 2>/dev/null && info "ParallelCluster chain created: $chain" \
+ || info "ParallelCluster chain exists: $chain"
+
+ iptables_add $source_chain $destination $chain
+}
+
+main() {
+ # Constants
+ PARALLELCLUSTER_CHAIN="PARALLELCLUSTER_IMDS"
+ OUTPUT_CHAIN="OUTPUT"
+ IMDS_IP="169.254.169.254"
+
+ # Parse options
+ while [ $# -gt 0 ] ; do
+ case "$1" in
+ --allow) allow_users="$2"; shift;;
+ --deny) deny_users="$2"; shift;;
+ --unset) unset_users="$2"; shift;;
+ --flush) flush="true";;
+ --help) help; exit 0;;
+ *) help; error "Unrecognized option '$1'";;
+ esac
+ shift
+ done
+
+ # Check required commands
+ command -v iptables >/dev/null || error "Cannot find required command: iptables"
+
+ # Check arguments and options
+ if [[ -z $allow_users && -z $deny_users && -z $unset_users && -z $flush ]]; then
+ error "Missing at least one mandatory option: '--allow', '--deny', '--unset', '--flush'"
+ fi
+
+ # Setup ParallelCluster chain
+ setup_chain $PARALLELCLUSTER_CHAIN $OUTPUT_CHAIN $IMDS_IP
+
+ # Flush ParallelCluster chain, if required
+ if [[ $flush == "true" ]]; then
+ iptables --flush $PARALLELCLUSTER_CHAIN
+ info "ParallelCluster chain flushed"
+ exit 0
+ fi
+
+ # Delete rule: ACCEPT/REJECT user, for every user to unset
+ IFS=","
+ for user in $unset_users; do
+ info "Deleting rules related to IMDS access for user: $user"
+ iptables_delete $PARALLELCLUSTER_CHAIN $IMDS_IP "ACCEPT" $user
+ iptables_delete $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT" $user
+ done
+
+ # Add rule: ACCEPT user, for every allowed user
+ for user in $allow_users; do
+ info "Allowing IMDS access for user: $user"
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "ACCEPT" $user
+ done
+
+ # Add rule: REJECT user, for every denied user
+ for user in $deny_users; do
+ info "Denying IMDS access for user: $user"
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT" $user
+ done
+
+ # Add rule: REJECT not allowed users
+ info "Denying IMDS access for not allowed users"
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT"
+}
+
+main "$@"[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] owner changed to 0
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] group changed to 0
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] mode changed to 744
- change mode from '' to '0744'
- change owner from '' to 'root'
- change group from '' to 'root'
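
Beyond the --flush/--allow combination the recipe runs next, the script also supports per-user denial. A hedged sketch ("someuser" is hypothetical):

    bash /opt/parallelcluster/scripts/imds/imds-access.sh --deny someuser
    sudo -u someuser curl -s --max-time 2 http://169.254.169.254/latest/meta-data/instance-id \
      || echo "IMDS blocked for someuser"
    bash /opt/parallelcluster/scripts/imds/imds-access.sh --unset someuser
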
* execute[IMDS lockdown enable] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[IMDS lockdown enable] action run (aws-parallelcluster-config::imds line 41)
[execute] [INFO] ParallelCluster chain created: PARALLELCLUSTER_IMDS
[INFO] Rule in chain OUTPUT: 169.254.169.254 PARALLELCLUSTER_IMDS
[INFO] ParallelCluster chain flushed
[INFO] ParallelCluster chain exists: PARALLELCLUSTER_IMDS
[INFO] Rule in chain OUTPUT: 169.254.169.254 PARALLELCLUSTER_IMDS
[INFO] Allowing IMDS access for user: root
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT root
[INFO] Allowing IMDS access for user: pcluster-admin
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT pcluster-admin
[INFO] Allowing IMDS access for user: ubuntu
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT ubuntu
[INFO] Denying IMDS access for not allowed users
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 REJECT
[2022-06-30T15:09:12+00:00] INFO: execute[IMDS lockdown enable] ran successfully
- execute bash /opt/parallelcluster/scripts/imds/imds-access.sh --flush && bash /opt/parallelcluster/scripts/imds/imds-access.sh --allow root,pcluster-admin,ubuntu
* execute[Save iptables rules] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[Save iptables rules] action run (aws-parallelcluster-config::imds line 56)
[2022-06-30T15:09:12+00:00] INFO: execute[Save iptables rules] ran successfully
- execute mkdir -p $(dirname /etc/parallelcluster/sysconfig/iptables.rules) && iptables-save > /etc/parallelcluster/sysconfig/iptables.rules
* template[/etc/init.d/parallelcluster-iptables] action create[2022-06-30T15:09:12+00:00] INFO: Processing template[/etc/init.d/parallelcluster-iptables] action create (aws-parallelcluster-config::imds line 60)
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] created file /etc/init.d/parallelcluster-iptables
- create new file /etc/init.d/parallelcluster-iptables[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] updated file contents /etc/init.d/parallelcluster-iptables
- update content in file /etc/init.d/parallelcluster-iptables from none to d54448
--- /etc/init.d/parallelcluster-iptables 2022-06-30 15:09:12.934872464 +0000
+++ /etc/init.d/.chef-parallelcluster-iptables20220630-1116-zldtbi 2022-06-30 15:09:12.934872464 +0000
@@ -1 +1,46 @@
+#!/bin/bash
+#
+# parallelcluster-iptables
+#
+# chkconfig: 12345 99 99
+# description: Backup and restore iptables rules
+
+### BEGIN INIT INFO
+# Provides: parallelcluster-iptables
+# Required-Start: $network
+# Required-Stop: $network
+# Default-Start: 1 2 3 4 5
+# Default-Stop: 0 6
+# Short-Description: Backup and restore iptables rules
+# Description: Backup and restore iptables rules
+### END INIT INFO
+
+IPTABLES_RULES_FILE="/etc/parallelcluster/sysconfig/iptables.rules"
+
+function start() {
+ if [[ -f $IPTABLES_RULES_FILE ]]; then
+ iptables-restore < $IPTABLES_RULES_FILE
+ echo "iptables rules restored from file: $IPTABLES_RULES_FILE"
+ else
+ echo "iptables rules left unchanged as file was not found: $IPTABLES_RULES_FILE"
+ fi
+}
+
+function stop() {
+ echo "saving iptables rules to file: $IPTABLES_RULES_FILE"
+ mkdir -p $(dirname $IPTABLES_RULES_FILE)
+ iptables-save > $IPTABLES_RULES_FILE
+ echo "iptables rules saved to file: $IPTABLES_RULES_FILE"
+}
+
+case "$1" in
+start|stop)
+ $1
+ ;;
+*)
+ echo "Usage: $0 {start|stop}"
+ exit 2
+esac
+
+exit $?[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] owner changed to 0
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] group changed to 0
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] mode changed to 744
- change mode from '' to '0744'
- change owner from '' to 'root'
- change group from '' to 'root'
* service[parallelcluster-iptables] action enable[2022-06-30T15:09:12+00:00] INFO: Processing service[parallelcluster-iptables] action enable (aws-parallelcluster-config::imds line 68)
[2022-06-30T15:09:14+00:00] INFO: service[parallelcluster-iptables] enabled
- enable service service[parallelcluster-iptables]
* service[parallelcluster-iptables] action start[2022-06-30T15:09:14+00:00] INFO: Processing service[parallelcluster-iptables] action start (aws-parallelcluster-config::imds line 68)
[2022-06-30T15:09:14+00:00] INFO: service[parallelcluster-iptables] started
- start service service[parallelcluster-iptables]
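The restore-on-boot behavior is wired up through the legacy init.d mechanism; on Ubuntu 20.04 systemd's sysv-generator exposes it as a regular unit. A quick manual exercise of the script (hypothetical commands, not part of this run):

    systemctl status parallelcluster-iptables    # generated unit wrapping /etc/init.d/parallelcluster-iptables
    service parallelcluster-iptables stop        # saves live rules to /etc/parallelcluster/sysconfig/iptables.rules
    service parallelcluster-iptables start       # restores them with iptables-restore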
[2022-06-30T15:09:14+00:00] INFO: replace_or_add[set fqdn in the /etc/hosts] sending reload action to ohai[reload_hostname] (delayed)
Recipe: aws-parallelcluster-slurm::init_dns
* ohai[reload_hostname] action reload[2022-06-30T15:09:14+00:00] INFO: Processing ohai[reload_hostname] action reload (aws-parallelcluster-slurm::init_dns line 91)
[2022-06-30T15:09:14+00:00] INFO: ohai[reload_hostname] reloaded
- re-run ohai and merge results into node attributes
[2022-06-30T15:09:14+00:00] WARN: Skipping final node save because override_runlist was given
[2022-06-30T15:09:14+00:00] INFO: Cinc Client Run complete in 28.057885357 seconds
[2022-06-30T15:09:14+00:00] INFO: Skipping removal of unused files from the cache
Running handlers:
[2022-06-30T15:09:14+00:00] INFO: Running report handlers
Running handlers complete
[2022-06-30T15:09:14+00:00] INFO: Report handlers complete
Deprecation warnings that must be addressed before upgrading to Chef Infra 18:
The resource in the nfs cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/nfs/resources/export.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_global resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/global.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_pip resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/pip.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_plugin resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/plugin.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_python resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/python.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_rehash resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/rehash.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_script resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/script.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_system_install resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/system_install.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_user_install resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/user_install.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The resource in the selinux cookbook should declare `unified_mode true` at 3 locations:
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/install.rb
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/module.rb
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/state.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The resource in the yum cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/yum/resources/globalconfig.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
Cinc Client finished, 31/36 resources updated in 32 seconds
[2022-06-30T15:09:17+00:00] INFO: Started Cinc Zero at chefzero://localhost:1 with repository at /etc/chef (One version per cookbook)
Starting Cinc Client, version 17.2.29
Patents: https://www.chef.io/patents
[2022-06-30T15:09:17+00:00] INFO: *** Cinc Client 17.2.29 ***
[2022-06-30T15:09:17+00:00] INFO: Platform: x86_64-linux
[2022-06-30T15:09:17+00:00] INFO: Cinc-client pid: 1526
[2022-06-30T15:09:20+00:00] WARN: Run List override has been provided.
[2022-06-30T15:09:20+00:00] WARN: Original Run List: []
[2022-06-30T15:09:20+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::config]]
[2022-06-30T15:09:20+00:00] INFO: Run List is [recipe[aws-parallelcluster::config]]
[2022-06-30T15:09:20+00:00] INFO: Run List expands to [aws-parallelcluster::config]
[2022-06-30T15:09:20+00:00] INFO: Starting Cinc Client Run for ip-10-0-0-21.us-east-2.compute.internal
[2022-06-30T15:09:20+00:00] INFO: Running start handlers
[2022-06-30T15:09:20+00:00] INFO: Start handlers complete.
resolving cookbooks for run list: ["aws-parallelcluster::config"]
[2022-06-30T15:09:23+00:00] INFO: Loading cookbooks [aws-parallelcluster@3.1.4, apt@7.4.0, iptables@8.0.0, line@4.0.1, nfs@2.6.4, openssh@2.9.1, pyenv@3.4.2, selinux@3.1.1, yum@6.1.1, yum-epel@4.1.2, aws-parallelcluster-install@3.1.4, aws-parallelcluster-config@3.1.4, aws-parallelcluster-slurm@3.1.4, aws-parallelcluster-scheduler-plugin@3.1.4, aws-parallelcluster-awsbatch@3.1.4, aws-parallelcluster-test@3.1.4]
[2022-06-30T15:09:23+00:00] INFO: Skipping removal of obsoleted cookbooks from the cache
Synchronizing Cookbooks:
- iptables (8.0.0)
- aws-parallelcluster (3.1.4)
- apt (7.4.0)
- line (4.0.1)
- yum (6.1.1)
- selinux (3.1.1)
- nfs (2.6.4)
- pyenv (3.4.2)
- openssh (2.9.1)
- yum-epel (4.1.2)
- aws-parallelcluster-install (3.1.4)
- aws-parallelcluster-awsbatch (3.1.4)
- aws-parallelcluster-config (3.1.4)
- aws-parallelcluster-slurm (3.1.4)
- aws-parallelcluster-scheduler-plugin (3.1.4)
- aws-parallelcluster-test (3.1.4)
Installing Cookbook Gems:
Compiling Cookbooks...
[2022-06-30T15:09:26+00:00] INFO: Detected bootstrap file aws-parallelcluster-cookbook-3.1.4
Converging 68 resources
Recipe: aws-parallelcluster::setup_envars
* ruby_block[Configure environment variable for recipes context: PATH] action run[2022-06-30T15:09:27+00:00] INFO: Processing ruby_block[Configure environment variable for recipes context: PATH] action run (aws-parallelcluster::setup_envars line 23)
[2022-06-30T15:09:27+00:00] INFO: ruby_block[Configure environment variable for recipes context: PATH] called
- execute the ruby block Configure environment variable for recipes context: PATH
* template[/etc/profile.d/path.sh] action create[2022-06-30T15:09:27+00:00] INFO: Processing template[/etc/profile.d/path.sh] action create (aws-parallelcluster::setup_envars line 32)
(up to date)
Recipe: aws-parallelcluster-config::openssh
* template[/usr/bin/ssh_target_checker.sh] action create[2022-06-30T15:09:27+00:00] INFO: Processing template[/usr/bin/ssh_target_checker.sh] action create (aws-parallelcluster-config::openssh line 19)
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] created file /usr/bin/ssh_target_checker.sh
- create new file /usr/bin/ssh_target_checker.sh[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] updated file contents /usr/bin/ssh_target_checker.sh
- update content in file /usr/bin/ssh_target_checker.sh from none to df73f0
--- /usr/bin/ssh_target_checker.sh 2022-06-30 15:09:27.579230955 +0000
+++ /usr/bin/.chef-ssh_target_checker20220630-1526-43dcui.sh 2022-06-30 15:09:27.571230760 +0000
@@ -1 +1,71 @@
+#!/bin/bash
+
+# Copyright 2013-2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the
+# License. A copy of the License is located at
+#
+# http://aws.amazon.com/apache2.0/
+#
+# or in the "LICENSE.txt" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
+# OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -o pipefail
+
+VPC_CIDR_LIST=(10.0.0.0/16)
+
+log() {
+ echo "$@" | logger -t "pcluster_ssh_target_checker"
+}
+
+convert_ip_to_decimal() {
+ IFS=./ read -r x y z t mask <<< "${1}"
+ echo -n "$((x<<24|y<<16|z<<8|t))"
+}
+
+convert_mask_to_decimal() {
+ IFS=/ read -r _ mask <<< "${1}"
+ echo -n "$((-1<<(32-mask)))"
+}
+
+check_ip_in_cidr() {
+ target_address=$(convert_ip_to_decimal "${1}")
+ base_address=$(convert_ip_to_decimal "${2}")
+ base_mask=$(convert_mask_to_decimal "${2}")
+
+ if (( (target_address&base_mask) == (base_address&base_mask) )); then
+ return 0
+ fi
+
+ return 1
+}
+
+target_host=$1
+if [[ -z "${target_host}" ]]; then
+ log "No input target host"
+ exit 1
+fi
+
+if ! resolved_ip=$(getent ahosts "${target_host}" | grep -v : | head -1 | cut -d' ' -f1); then
+ log "Cannot resolve target Host ${target_host}"
+ exit 1
+fi
+
+if [[ "${resolved_ip}" == "127.0.0.1" ]]; then
+ # Special case for localhost
+ log "Target Host ${target_host} is in VPC CIDR"
+ exit 0
+fi
+
+for vpc_cidr in "${VPC_CIDR_LIST[@]}"
+do
+ if check_ip_in_cidr "${resolved_ip}" "${vpc_cidr}"; then
+ log "Target Host ${target_host} is in VPC CIDR ${vpc_cidr}"
+ exit 0
+ fi
+done
+
+log "Target Host ${target_host} is not in any VPC CIDR ${vpc_cidr_list[*]}"
+exit 1[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] owner changed to 0
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] group changed to 0
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] mode changed to 755
- change mode from '' to '0755'
- change owner from '' to 'root'
- change group from '' to 'root'
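The CIDR test in ssh_target_checker.sh is pure integer arithmetic: each dotted quad is packed into a 32-bit value, then both host and network are masked and compared. A worked example for this cluster's 10.0.0.0/16 range, plus a hypothetical invocation:

    # 10.0.0.21  -> 10<<24 | 0<<16 | 0<<8 | 21 = 167772181
    # /16 mask   -> -1<<(32-16), which clears the low 16 bits
    # 167772181 & mask = 167772160 = 10.0.0.0 & mask  -> host is inside the CIDR
    bash /usr/bin/ssh_target_checker.sh 10.0.0.21 && echo "in VPC"
    # logs "Target Host 10.0.0.21 is in VPC CIDR 10.0.0.0/16" to syslog (tag: pcluster_ssh_target_checker)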
Recipe: aws-parallelcluster-config::base
* sysctl[fs.protected_regular] action apply[2022-06-30T15:09:27+00:00] INFO: Processing sysctl[fs.protected_regular] action apply (aws-parallelcluster-config::base line 23)
* directory[/etc/sysctl.d] action create[2022-06-30T15:09:27+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] action create[2022-06-30T15:09:27+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:27+00:00] INFO: file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] created file /etc/sysctl.d/99-chef-fs.protected_regular.conf
- create new file /etc/sysctl.d/99-chef-fs.protected_regular.conf[2022-06-30T15:09:27+00:00] INFO: file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] updated file contents /etc/sysctl.d/99-chef-fs.protected_regular.conf
- update content in file /etc/sysctl.d/99-chef-fs.protected_regular.conf from none to e8e418
--- /etc/sysctl.d/99-chef-fs.protected_regular.conf 2022-06-30 15:09:27.707234087 +0000
+++ /etc/sysctl.d/.chef-99-chef-fs20220630-1526-g0pt9p.protected_regular.conf 2022-06-30 15:09:27.707234087 +0000
@@ -1 +1,2 @@
+fs.protected_regular = 0
* execute[Load sysctl values] action run[2022-06-30T15:09:27+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:27+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create fs.protected_regular
- set value to "0"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
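Setting fs.protected_regular to 0 relaxes the kernel restriction (introduced in Linux 4.19) on O_CREAT opens of regular files owned by other users in world-writable sticky directories, which multi-user shared directories on the cluster presumably depend on. The drop-in can be verified at runtime (hypothetical check):

    sysctl -n fs.protected_regular    # expected: 0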
Recipe: nfs::_common
* apt_package[nfs-common] action install[2022-06-30T15:09:27+00:00] INFO: Processing apt_package[nfs-common] action install (nfs::_common line 22)
(up to date)
* apt_package[rpcbind] action install[2022-06-30T15:09:31+00:00] INFO: Processing apt_package[rpcbind] action install (nfs::_common line 22)
(up to date)
* directory[/etc/default] action create[2022-06-30T15:09:31+00:00] INFO: Processing directory[/etc/default] action create (nfs::_common line 26)
(skipped due to only_if)
* template[/etc/default/nfs-common] action create[2022-06-30T15:09:31+00:00] INFO: Processing template[/etc/default/nfs-common] action create (nfs::_common line 36)
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] backed up to /etc/chef/local-mode-cache/backup/etc/default/nfs-common.chef-20220630150931.412023
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] updated file contents /etc/default/nfs-common
- update content in file /etc/default/nfs-common from 1bf5d6 to 89c769
--- /etc/default/nfs-common 2022-05-12 10:22:25.460099899 +0000
+++ /etc/default/.chef-nfs-common20220630-1526-z3yn8h 2022-06-30 15:09:31.407324988 +0000
@@ -1,3 +1,3 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal# Local modifications will be overwritten.
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal# Local modifications will be overwritten.
STATDOPTS="--port 32765 --outgoing-port 32766"
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[portmap] (immediate)
* service[portmap] action restart[2022-06-30T15:09:31+00:00] INFO: Processing service[portmap] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[portmap] restarted
- restart service service[portmap]
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[lock] (immediate)
* service[lock] action restart[2022-06-30T15:09:31+00:00] INFO: Processing service[lock] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[lock] restarted
- restart service service[lock]
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[nfs-config.service] (immediate)
* service[nfs-config.service] action restart[2022-06-30T15:09:31+00:00] INFO: Processing service[nfs-config.service] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[nfs-config.service] restarted
- restart service service[nfs-config.service]
* template[/etc/modprobe.d/lockd.conf] action create[2022-06-30T15:09:31+00:00] INFO: Processing template[/etc/modprobe.d/lockd.conf] action create (nfs::_common line 36)
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] backed up to /etc/chef/local-mode-cache/backup/etc/modprobe.d/lockd.conf.chef-20220630150931.924946
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] updated file contents /etc/modprobe.d/lockd.conf
- update content in file /etc/modprobe.d/lockd.conf from 2bf649 to 859601
--- /etc/modprobe.d/lockd.conf 2022-05-12 10:22:25.676099974 +0000
+++ /etc/modprobe.d/.chef-lockd20220630-1526-6lzsix.conf 2022-06-30 15:09:31.919337663 +0000
@@ -1,4 +1,4 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal
# Local modifications will be overwritten.
options lockd nlm_udpport=32768 nlm_tcpport=32768
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[portmap] (immediate)
* service[portmap] action restart[2022-06-30T15:09:31+00:00] INFO: Processing service[portmap] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[portmap] restarted
- restart service service[portmap]
[2022-06-30T15:09:32+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[lock] (immediate)
* service[lock] action restart[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[lock] restarted
- restart service service[lock]
[2022-06-30T15:09:32+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[nfs-config.service] (immediate)
* service[nfs-config.service] action restart[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[nfs-config.service] restarted
- restart service service[nfs-config.service]
* service[portmap] action start[2022-06-30T15:09:32+00:00] INFO: Processing service[portmap] action start (nfs::_common line 46)
(up to date)
* service[portmap] action enable[2022-06-30T15:09:32+00:00] INFO: Processing service[portmap] action enable (nfs::_common line 46)
(up to date)
* service[lock] action start[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action start (nfs::_common line 46)
(up to date)
* service[lock] action enable[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action enable (nfs::_common line 46)
(up to date)
* service[nfs-config.service] action start[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action start (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[nfs-config.service] started
- start service service[nfs-config.service]
* service[nfs-config.service] action enable[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action enable (nfs::_common line 46)
(up to date)
Recipe: nfs::server
* apt_package[nfs-kernel-server] action install[2022-06-30T15:09:32+00:00] INFO: Processing apt_package[nfs-kernel-server] action install (nfs::server line 23)
(up to date)
* template[/etc/default/nfs-kernel-server] action create[2022-06-30T15:09:32+00:00] INFO: Processing template[/etc/default/nfs-kernel-server] action create (nfs::server line 30)
[2022-06-30T15:09:32+00:00] INFO: template[/etc/default/nfs-kernel-server] backed up to /etc/chef/local-mode-cache/backup/etc/default/nfs-kernel-server.chef-20220630150932.773271
[2022-06-30T15:09:32+00:00] INFO: template[/etc/default/nfs-kernel-server] updated file contents /etc/default/nfs-kernel-server
- update content in file /etc/default/nfs-kernel-server from 5ba45c to 1c890d
--- /etc/default/nfs-kernel-server 2022-05-12 10:22:32.092102192 +0000
+++ /etc/default/.chef-nfs-kernel-server20220630-1526-ptjcf8 2022-06-30 15:09:32.767358655 +0000
@@ -1,4 +1,4 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal# Local modifications will be overwritten.
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal# Local modifications will be overwritten.
# Rendered Debian/Ubuntu template variant
RPCMOUNTDOPTS="-p 32767"
RPCNFSDCOUNT="8"
* service[nfs-kernel-server] action start[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-kernel-server] action start (nfs::server line 42)
(up to date)
* service[nfs-kernel-server] action enable[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-kernel-server] action enable (nfs::server line 42)
(up to date)
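Worth noting across these templates: the NFS side ports are pinned (statd 32765/32766, mountd 32767, lockd 32768 from lockd.conf above) so firewall and security-group rules can target fixed ports instead of the portmapper's random assignments. The registrations can be checked with the standard rpcinfo tool (hypothetical invocation):

    rpcinfo -p localhost | grep -E 'status|mountd|nlockmgr'    # should show the pinned ports 32765-32768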
Recipe: nfs::_idmap
* template[/etc/idmapd.conf] action create[2022-06-30T15:09:32+00:00] INFO: Processing template[/etc/idmapd.conf] action create (nfs::_idmap line 23)
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] backed up to /etc/chef/local-mode-cache/backup/etc/idmapd.conf.chef-20220630150932.915870
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] updated file contents /etc/idmapd.conf
- update content in file /etc/idmapd.conf from b0488e to b10c33
--- /etc/idmapd.conf 2021-05-12 19:30:06.000000000 +0000
+++ /etc/.chef-idmapd20220630-1526-9cg9sn.conf 2022-06-30 15:09:32.907362122 +0000
@@ -2,11 +2,35 @@
Verbosity = 0
Pipefs-Directory = /run/rpc_pipefs
-# set your own domain here, if it differs from FQDN minus hostname
-# Domain = localdomain
+# The following should be set to the local NFSv4 domain name
+# The default is the host's DNS domain name.
+Domain = us-east-2.compute.internal
+
+# The following is a comma-separated list of Kerberos realm
+# names that should be considered to be equivalent to the
+# local realm, such that <user>@REALM.A can be assumed to
+# be the same user as <user>@REALM.B
+# If not specified, the default local realm is the domain name,
+# which defaults to the host's DNS domain name,
+# translated to upper-case.
+# Note that if this value is specified, the local realm name
+# must be included in the list!
+#Local-Realms =
+
[Mapping]
Nobody-User = nobody
Nobody-Group = nogroup
+
+[Translation]
+
+# Translation Method is a comma-separated, ordered list of
+# translation methods that can be used. Distributed methods
+# include "nsswitch", "umich_ldap", and "static". Each method
+# is a dynamically loadable plugin library.
+# New methods may be defined and inserted in the list.
+# The default is "nsswitch".
+Method = nsswitch
+
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] sending restart action to service[idmap] (immediate)
* service[idmap] action restart[2022-06-30T15:09:32+00:00] INFO: Processing service[idmap] action restart (nfs::_idmap line 29)
[2022-06-30T15:09:33+00:00] INFO: service[idmap] restarted
- restart service service[idmap]
* service[idmap] action start[2022-06-30T15:09:33+00:00] INFO: Processing service[idmap] action start (nfs::_idmap line 29)
(up to date)
* service[idmap] action enable[2022-06-30T15:09:33+00:00] INFO: Processing service[idmap] action enable (nfs::_idmap line 29)
(up to date)
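Client and server now share one NFSv4 identity-mapping domain. A minimal sanity check that the rendered values took effect (hypothetical commands):

    grep -E '^(Domain|Method)' /etc/idmapd.conf
    # Domain = us-east-2.compute.internal
    # Method = nsswitch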
Recipe: aws-parallelcluster-config::nfs
* service[nfs-kernel-server] action restart[2022-06-30T15:09:33+00:00] INFO: Processing service[nfs-kernel-server] action restart (aws-parallelcluster-config::nfs line 39)
[2022-06-30T15:09:34+00:00] INFO: service[nfs-kernel-server] restarted
- restart service service[nfs-kernel-server]
Recipe: aws-parallelcluster-config::base
* service[setup-ephemeral] action enable[2022-06-30T15:09:34+00:00] INFO: Processing service[setup-ephemeral] action enable (aws-parallelcluster-config::base line 30)
[2022-06-30T15:09:35+00:00] INFO: service[setup-ephemeral] enabled
- enable service service[setup-ephemeral]
* execute[Setup of ephemeral drivers] action run[2022-06-30T15:09:35+00:00] INFO: Processing execute[Setup of ephemeral drivers] action run (aws-parallelcluster-config::base line 37)
[execute] ParallelCluster - [INFO] This instance type doesn't have instance store
[2022-06-30T15:09:35+00:00] INFO: execute[Setup of ephemeral drivers] ran successfully
- execute /usr/local/sbin/setup-ephemeral-drives.sh
* sysctl[net.core.somaxconn] action apply[2022-06-30T15:09:35+00:00] INFO: Processing sysctl[net.core.somaxconn] action apply (aws-parallelcluster-config::base line 44)
* directory[/etc/sysctl.d] action create[2022-06-30T15:09:36+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] action create[2022-06-30T15:09:36+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] created file /etc/sysctl.d/99-chef-net.core.somaxconn.conf
- create new file /etc/sysctl.d/99-chef-net.core.somaxconn.conf[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] updated file contents /etc/sysctl.d/99-chef-net.core.somaxconn.conf
- update content in file /etc/sysctl.d/99-chef-net.core.somaxconn.conf from none to 364f9b
--- /etc/sysctl.d/99-chef-net.core.somaxconn.conf 2022-06-30 15:09:36.059440150 +0000
+++ /etc/sysctl.d/.chef-99-chef-net20220630-1526-wi832c.core.somaxconn.conf 2022-06-30 15:09:36.059440150 +0000
@@ -1 +1,2 @@
+net.core.somaxconn = 65535
* execute[Load sysctl values] action run[2022-06-30T15:09:36+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:36+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create net.core.somaxconn
- set value to "65535"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
* sysctl[net.ipv4.tcp_max_syn_backlog] action apply[2022-06-30T15:09:36+00:00] INFO: Processing sysctl[net.ipv4.tcp_max_syn_backlog] action apply (aws-parallelcluster-config::base line 48)
* directory[/etc/sysctl.d] action create[2022-06-30T15:09:36+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] action create[2022-06-30T15:09:36+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] created file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf
- create new file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] updated file contents /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf
- update content in file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf from none to 97069c
--- /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf 2022-06-30 15:09:36.215444012 +0000
+++ /etc/sysctl.d/.chef-99-chef-net20220630-1526-90ip3s.ipv4.tcp_max_syn_backlog.conf 2022-06-30 15:09:36.215444012 +0000
@@ -1 +1,2 @@
+net.ipv4.tcp_max_syn_backlog = 65535
* execute[Load sysctl values] action run[2022-06-30T15:09:36+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:36+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create net.ipv4.tcp_max_syn_backlog
- set value to "65535"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
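Both connection-backlog keys are raised to 65535 for a head node that fields many simultaneous compute-node connections. Verifying the live values (hypothetical command; sysctl accepts multiple keys):

    sysctl -n net.core.somaxconn net.ipv4.tcp_max_syn_backlog
    # 65535
    # 65535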
Recipe: aws-parallelcluster-config::chrony
* service[chrony] action enable[2022-06-30T15:09:36+00:00] INFO: Processing service[chrony] action enable (aws-parallelcluster-config::chrony line 18)
(up to date)
* service[chrony] action start[2022-06-30T15:09:36+00:00] INFO: Processing service[chrony] action start (aws-parallelcluster-config::chrony line 18)
(up to date)
Recipe: aws-parallelcluster-config::head_node_base
* execute[attach_volume_0] action run[2022-06-30T15:09:36+00:00] INFO: Processing execute[attach_volume_0] action run (aws-parallelcluster-config::head_node_base line 42)
[execute] Traceback (most recent call last):
File "/usr/local/sbin/attachVolume.py", line 152, in <module>
main()
File "/usr/local/sbin/attachVolume.py", line 130, in main
response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance
================================================================================
Error executing action `run` on resource 'execute[attach_volume_0]'
================================================================================
Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Expected process to exit with [0], but received '1'
---- Begin output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
STDOUT:
STDERR: Traceback (most recent call last):
File "/usr/local/sbin/attachVolume.py", line 152, in <module>
main()
File "/usr/local/sbin/attachVolume.py", line 130, in main
response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance
---- End output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
Ran /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 returned 1
Resource Declaration:
---------------------
# In /etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/recipes/head_node_base.rb
42: execute "attach_volume_#{index}" do
43: command "#{node.default['cluster']['cookbook_virtualenv_path']}/bin/python /usr/local/sbin/attachVolume.py #{volumeid}"
44: creates dev_path[index]
45: end
46:
Compiled Resource:
------------------
# Declared in /etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/recipes/head_node_base.rb:42:in `block in from_file'
execute("attach_volume_0") do
action [:run]
default_guard_interpreter :execute
command "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4"
declared_type :execute
cookbook_name "aws-parallelcluster-config"
recipe_name "head_node_base"
user nil
domain nil
creates "/dev/disk/by-ebs-volumeid/vol-0b087667ecac188e4"
end
System Info:
------------
chef_version=17.2.29
platform=ubuntu
platform_version=20.04
ruby=ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
program_name=/bin/cinc-client
executable=/opt/cinc/bin/cinc-client
[2022-06-30T15:09:37+00:00] INFO: Running queued delayed notifications before re-raising exception
[2022-06-30T15:09:37+00:00] INFO: template[/etc/default/nfs-kernel-server] sending restart action to service[nfs-kernel-server] (delayed)
Recipe: aws-parallelcluster-config::nfs
* service[nfs-kernel-server] action restart[2022-06-30T15:09:37+00:00] INFO: Processing service[nfs-kernel-server] action restart (aws-parallelcluster-config::nfs line 39)
[2022-06-30T15:09:39+00:00] INFO: service[nfs-kernel-server] restarted
- restart service service[nfs-kernel-server]
Running handlers:
[2022-06-30T15:09:39+00:00] ERROR: Running exception handlers
Running handlers complete
[2022-06-30T15:09:39+00:00] ERROR: Exception handlers complete
Cinc Client failed. 27 resources updated in 21 seconds
[2022-06-30T15:09:39+00:00] FATAL: Stacktrace dumped to /etc/chef/local-mode-cache/cache/cinc-stacktrace.out
[2022-06-30T15:09:39+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2022-06-30T15:09:39+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: execute[attach_volume_0] (aws-parallelcluster-config::head_node_base line 42) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
STDOUT:
STDERR: Traceback (most recent call last):
File "/usr/local/sbin/attachVolume.py", line 152, in <module>
main()
File "/usr/local/sbin/attachVolume.py", line 130, in main
response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance
---- End output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
Ran /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 returned 1
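The root cause is in the traceback: EC2 rejects AttachVolume with VolumeInUse because vol-0b087667ecac188e4 is already attached somewhere, while the resource's creates guard (/dev/disk/by-ebs-volumeid/vol-0b087667ecac188e4) did not short-circuit the run because that symlink does not exist on this instance -- i.e. the volume is most likely still attached to a previous instance. A hedged recovery sketch using standard AWS CLI calls:

    # Find where the volume is currently attached.
    aws ec2 describe-volumes --volume-ids vol-0b087667ecac188e4 --query 'Volumes[0].Attachments'
    # If it is held by a stale instance, detach it, wait for 'available', then re-run cinc-client:
    aws ec2 detach-volume --volume-id vol-0b087667ecac188e4
    aws ec2 wait volume-available --volume-ids vol-0b087667ecac188e4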
ubuntu@ip-10-0-0-21:/var/log$