Created June 30, 2022 15:16
ubuntu@ip-10-0-0-21:/var/log$ cat chef-client.log | |
# Logfile created on 2022-06-30 15:08:39 +0000 by logger.rb/v1.4.3 | |
[2022-06-30T15:08:41+00:00] INFO: Started Cinc Zero at chefzero://localhost:1 with repository at /etc/chef (One version per cookbook) | |
Starting Cinc Client, version 17.2.29 | |
Patents: https://www.chef.io/patents | |
[2022-06-30T15:08:42+00:00] INFO: *** Cinc Client 17.2.29 *** | |
[2022-06-30T15:08:42+00:00] INFO: Platform: x86_64-linux | |
[2022-06-30T15:08:42+00:00] INFO: Cinc-client pid: 1116 | |
[2022-06-30T15:08:46+00:00] WARN: Run List override has been provided. | |
[2022-06-30T15:08:46+00:00] WARN: Original Run List: [] | |
[2022-06-30T15:08:46+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::init]] | |
[2022-06-30T15:08:46+00:00] INFO: Run List is [recipe[aws-parallelcluster::init]] | |
[2022-06-30T15:08:46+00:00] INFO: Run List expands to [aws-parallelcluster::init] | |
[2022-06-30T15:08:46+00:00] INFO: Starting Cinc Client Run for ip-10-0-0-21.us-east-2.compute.internal | |
[2022-06-30T15:08:46+00:00] INFO: Running start handlers | |
[2022-06-30T15:08:46+00:00] INFO: Start handlers complete. | |
resolving cookbooks for run list: ["aws-parallelcluster::init"] | |
[2022-06-30T15:08:51+00:00] INFO: Loading cookbooks [aws-parallelcluster@3.1.4, apt@7.4.0, iptables@8.0.0, line@4.0.1, nfs@2.6.4, openssh@2.9.1, pyenv@3.4.2, selinux@3.1.1, yum@6.1.1, yum-epel@4.1.2, aws-parallelcluster-install@3.1.4, aws-parallelcluster-config@3.1.4, aws-parallelcluster-slurm@3.1.4, aws-parallelcluster-scheduler-plugin@3.1.4, aws-parallelcluster-awsbatch@3.1.4, aws-parallelcluster-test@3.1.4] | |
[2022-06-30T15:08:51+00:00] INFO: Skipping removal of obsoleted cookbooks from the cache | |
Synchronizing Cookbooks: | |
- aws-parallelcluster (3.1.4) | |
- apt (7.4.0) | |
- iptables (8.0.0) | |
- line (4.0.1) | |
- nfs (2.6.4) | |
- openssh (2.9.1) | |
- pyenv (3.4.2) | |
- selinux (3.1.1) | |
- yum (6.1.1) | |
- yum-epel (4.1.2) | |
- aws-parallelcluster-install (3.1.4) | |
- aws-parallelcluster-config (3.1.4) | |
- aws-parallelcluster-slurm (3.1.4) | |
- aws-parallelcluster-scheduler-plugin (3.1.4) | |
- aws-parallelcluster-awsbatch (3.1.4) | |
- aws-parallelcluster-test (3.1.4) | |
Installing Cookbook Gems: | |
Compiling Cookbooks... | |
[2022-06-30T15:08:54+00:00] INFO: Detected bootstrap file aws-parallelcluster-cookbook-3.1.4 | |
[2022-06-30T15:08:54+00:00] INFO: Appending search domain 'compute1.pcluster.' to /etc/systemd/resolved.conf | |
[2022-06-30T15:08:54+00:00] INFO: Restarting 'systemd-resolved' service, platform ubuntu '20.04' | |
Converging 24 resources | |
Recipe: aws-parallelcluster-config::init | |
* template[/etc/parallelcluster/cfnconfig] action create | |
[2022-06-30T15:08:55+00:00] INFO: Processing template[/etc/parallelcluster/cfnconfig] action create (aws-parallelcluster-config::init line 39) | |
[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] created file /etc/parallelcluster/cfnconfig | |
- create new file /etc/parallelcluster/cfnconfig | |
[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] updated file contents /etc/parallelcluster/cfnconfig | |
- update content in file /etc/parallelcluster/cfnconfig from none to 0cf156 | |
--- /etc/parallelcluster/cfnconfig 2022-06-30 15:08:55.602445872 +0000 | |
+++ /etc/parallelcluster/.chef-cfnconfig20220630-1116-jotb8o 2022-06-30 15:08:55.602445872 +0000 | |
@@ -1 +1,16 @@ | |
+stack_name=compute1 | |
+cfn_preinstall=NONE | |
+cfn_preinstall_args=(NONE) | |
+cfn_postinstall=NONE | |
+cfn_postinstall_args=(NONE) | |
+cfn_region=us-east-2 | |
+cfn_scheduler=slurm | |
+cfn_scheduler_slots=vcpus | |
+cfn_instance_slots=1 | |
+cfn_ephemeral_dir=/scratch | |
+cfn_ebs_shared_dirs=/data | |
+cfn_proxy=NONE | |
+cfn_node_type=HeadNode | |
+cfn_cluster_user=ubuntu | |
+cfn_volume=vol-0b087667ecac188e4 | |
[2022-06-30T15:08:55+00:00] INFO: template[/etc/parallelcluster/cfnconfig] mode changed to 644 | |
- change mode from '' to '0644' | |
* link[/opt/parallelcluster/cfnconfig] action create | |
[2022-06-30T15:08:55+00:00] INFO: Processing link[/opt/parallelcluster/cfnconfig] action create (aws-parallelcluster-config::init line 44) | |
[2022-06-30T15:08:55+00:00] INFO: link[/opt/parallelcluster/cfnconfig] created | |
- create symlink at /opt/parallelcluster/cfnconfig to /etc/parallelcluster/cfnconfig | |
* template[/opt/parallelcluster/scripts/fetch_and_run] action create | |
[2022-06-30T15:08:55+00:00] INFO: Processing template[/opt/parallelcluster/scripts/fetch_and_run] action create (aws-parallelcluster-config::init line 48) | |
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] created file /opt/parallelcluster/scripts/fetch_and_run | |
- create new file /opt/parallelcluster/scripts/fetch_and_run | |
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] updated file contents /opt/parallelcluster/scripts/fetch_and_run | |
- update content in file /opt/parallelcluster/scripts/fetch_and_run from none to 1eb47f | |
--- /opt/parallelcluster/scripts/fetch_and_run 2022-06-30 15:08:55.666447442 +0000 | |
+++ /opt/parallelcluster/scripts/.chef-fetch_and_run20220630-1116-zss57f 2022-06-30 15:08:55.666447442 +0000 | |
@@ -1 +1,73 @@ | |
+#!/bin/bash | |
+ | |
+cfnconfig_file="/etc/parallelcluster/cfnconfig" | |
+. ${cfnconfig_file} | |
+ | |
+# Check expected variables from cfnconfig file | |
+function check_params () { | |
+ if [ -z "${cfn_region}" ] || [ -z "${cfn_preinstall}" ] || [ -z "${cfn_preinstall_args}" ] || [ -z "${cfn_postinstall}" ] || [ -z "${cfn_postinstall_args}" ]; then | |
+ error_exit "One or more required variables from ${cfnconfig_file} file are undefined" | |
+ fi | |
+} | |
+ | |
+# Error exit function | |
+function error_exit () { | |
+ script=`basename $0` | |
+ echo "parallelcluster: ${script} - $1" | |
+ logger -t parallelcluster "${script} - $1" | |
+ exit 1 | |
+} | |
+ | |
+function download_run (){ | |
+ url=$1 | |
+ shift | |
+ scheme=$(echo "${url}"| cut -d: -f1) | |
+ tmpfile=$(mktemp) | |
+ trap "/bin/rm -f $tmpfile" RETURN | |
+ if [ "${scheme}" == "s3" ]; then | |
+ /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws --region ${cfn_region} s3 cp ${url} - > $tmpfile || return 1 | |
+ else | |
+ wget -qO- ${url} > $tmpfile || return 1 | |
+ fi | |
+ chmod +x $tmpfile || return 1 | |
+ $tmpfile "$@" || error_exit "Failed to run ${ACTION}, ${file} failed with non 0 return code: $?" | |
+} | |
+ | |
+function run_preinstall () { | |
+ if [ "${cfn_preinstall}" != "NONE" ]; then | |
+ file="${cfn_preinstall}" | |
+ if [ "${cfn_preinstall_args}" != "NONE" ]; then | |
+ download_run ${cfn_preinstall} "${cfn_preinstall_args[@]}" | |
+ else | |
+ download_run ${cfn_preinstall} | |
+ fi | |
+ fi || error_exit "Failed to run preinstall" | |
+} | |
+ | |
+function run_postinstall () { | |
+ RC=0 | |
+ if [ "${cfn_postinstall}" != "NONE" ]; then | |
+ file="${cfn_postinstall}" | |
+ if [ "${cfn_postinstall_args}" != "NONE" ]; then | |
+ download_run ${cfn_postinstall} "${cfn_postinstall_args[@]}" | |
+ else | |
+ download_run ${cfn_postinstall} | |
+ fi | |
+ fi || error_exit "Failed to run postinstall" | |
+} | |
+ | |
+check_params | |
+ | |
+ACTION=${1#?} | |
+case ${ACTION} in | |
+ preinstall) | |
+ run_preinstall | |
+ ;; | |
+ postinstall) | |
+ run_postinstall | |
+ ;; | |
+ *) | |
+ echo "Unknown action. Exit gracefully" | |
+ exit 0 | |
+esac | |
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] owner changed to 0 | |
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] group changed to 0 | |
[2022-06-30T15:08:55+00:00] INFO: template[/opt/parallelcluster/scripts/fetch_and_run] mode changed to 755 | |
- change mode from '' to '0755' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
* fetch_config[Fetch and load cluster configs] action run | |
[2022-06-30T15:08:55+00:00] INFO: Processing fetch_config[Fetch and load cluster configs] action run (aws-parallelcluster-config::init line 57) | |
* execute[copy_cluster_config_from_s3] action run | |
[2022-06-30T15:08:55+00:00] INFO: Processing execute[copy_cluster_config_from_s3] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/resources/fetch_config.rb line 39) | |
[execute] { | |
"AcceptRanges": "bytes", | |
"LastModified": "Thu, 30 Jun 2022 15:03:57 GMT", | |
"ContentLength": 2393, | |
"ETag": "\"f46e93f2e1d20766b21d01c749cd9024\"", | |
"VersionId": "ejW58vArA8P12goCpkK88TZUjFZNFL3z", | |
"ContentType": "binary/octet-stream", | |
"ServerSideEncryption": "AES256", | |
"Metadata": {} | |
} | |
[2022-06-30T15:08:58+00:00] INFO: execute[copy_cluster_config_from_s3] ran successfully | |
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws s3api get-object --bucket parallelcluster-2e5bf78a0b005f30-v1-do-not-delete --key parallelcluster/3.1.4/clusters/compute1-9k2cwodproy9ug0m/configs/cluster-config-with-implied-values.yaml --region us-east-2 /opt/parallelcluster/shared/cluster-config.yaml --version-id ejW58vArA8P12goCpkK88TZUjFZNFL3z | |
* execute[copy_instance_type_data_from_s3] action run | |
[2022-06-30T15:08:58+00:00] INFO: Processing execute[copy_instance_type_data_from_s3] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/resources/fetch_config.rb line 53) | |
[execute] { | |
"AcceptRanges": "bytes", | |
"LastModified": "Thu, 30 Jun 2022 15:04:04 GMT", | |
"ContentLength": 2894, | |
"ETag": "\"fa59c29b20615569e2fb5182bb7fc4d9\"", | |
"VersionId": "ZCV8h_S0EgcsxtoIbRgX5frTWNP6Ai1x", | |
"ContentType": "binary/octet-stream", | |
"ServerSideEncryption": "AES256", | |
"Metadata": {} | |
} | |
[2022-06-30T15:08:59+00:00] INFO: execute[copy_instance_type_data_from_s3] ran successfully | |
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/aws s3api get-object --bucket parallelcluster-2e5bf78a0b005f30-v1-do-not-delete --key parallelcluster/3.1.4/clusters/compute1-9k2cwodproy9ug0m/configs/instance-types-data.json --region us-east-2 /opt/parallelcluster/shared/instance-types-data.json | |
* ruby_block[load cluster configuration] action run | |
[2022-06-30T15:08:59+00:00] INFO: Processing ruby_block[load cluster configuration] action run (/etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster/libraries/helpers.rb line 537) | |
[2022-06-30T15:08:59+00:00] INFO: ruby_block[load cluster configuration] called | |
- execute the ruby block load cluster configuration | |
Recipe: aws-parallelcluster-config::cloudwatch_agent | |
* cookbook_file[write_cloudwatch_agent_json.py] action create | |
[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[write_cloudwatch_agent_json.py] action create (aws-parallelcluster-config::cloudwatch_agent line 19) | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] created file /usr/local/bin/write_cloudwatch_agent_json.py | |
- create new file /usr/local/bin/write_cloudwatch_agent_json.py | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] updated file contents /usr/local/bin/write_cloudwatch_agent_json.py | |
- update content in file /usr/local/bin/write_cloudwatch_agent_json.py from none to 056bb9 | |
--- /usr/local/bin/write_cloudwatch_agent_json.py 2022-06-30 15:08:59.290536486 +0000 | |
+++ /usr/local/bin/.chef-write_cloudwatch_agent_json20220630-1116-4ghn0n.py 2022-06-30 15:08:59.286536388 +0000 | |
@@ -1 +1,217 @@ | |
+#!/usr/bin/env python | |
+""" | |
+Write the CloudWatch agent configuration file. | |
+ | |
+Write the JSON used to configure the CloudWatch agent on an instance conditional | |
+on the scheduler to be used, the platform (OS family) in use and the instance's role in the cluster. | |
+""" | |
+ | |
+import argparse | |
+import json | |
+import os | |
+import socket | |
+ | |
+import yaml | |
+ | |
+AWS_CLOUDWATCH_CFG_PATH = "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json" | |
+ | |
+ | |
+def parse_args(): | |
+ """Parse CL args and return an argparse.Namespace.""" | |
+ parser = argparse.ArgumentParser(description="Create the cloudwatch agent config file") | |
+ parser.add_argument("--config", help="Path to JSON file describing logs that should be monitored", required=True) | |
+ parser.add_argument( | |
+ "--platform", help="OS family of this instance", choices=["amazon", "centos", "ubuntu"], required=True | |
+ ) | |
+ parser.add_argument("--log-group", help="Name of the log group", required=True) | |
+ parser.add_argument( | |
+ "--node-role", | |
+ required=True, | |
+ choices=["HeadNode", "ComputeFleet"], | |
+ help="Role this node plays in the cluster " "(i.e., is it a compute node or the head node?)", | |
+ ) | |
+ parser.add_argument("--scheduler", required=True, choices=["slurm", "awsbatch", "plugin"], help="Scheduler") | |
+ parser.add_argument( | |
+ "--cluster-config-path", | |
+ required=False, | |
+ help="Cluster configuration path", | |
+ ) | |
+ return parser.parse_args() | |
+ | |
+ | |
+def gethostname(): | |
+ """Return hostname of this instance.""" | |
+ return socket.gethostname().split(".")[0] | |
+ | |
+ | |
+def write_config(config): | |
+ """Write config to AWS_CLOUDWATCH_CFG_PATH.""" | |
+ with open(AWS_CLOUDWATCH_CFG_PATH, "w+") as output_config_file: | |
+ json.dump(config, output_config_file, indent=4) | |
+ | |
+ | |
+def add_log_group_name_params(log_group_name, configs): | |
+ """Add a "log_group_name": log_group_name to every config.""" | |
+ for config in configs: | |
+ config.update({"log_group_name": log_group_name}) | |
+ return configs | |
+ | |
+ | |
+def add_instance_log_stream_prefixes(configs): | |
+ """Prefix all log_stream_name fields with instance identifiers.""" | |
+ for config in configs: | |
+ config["log_stream_name"] = "{host}.{{instance_id}}.{log_stream_name}".format( | |
+ host=gethostname(), log_stream_name=config["log_stream_name"] | |
+ ) | |
+ return configs | |
+ | |
+ | |
+def read_data(config_path): | |
+ """Read in log configuration data from config_path.""" | |
+ with open(config_path) as infile: | |
+ return json.load(infile) | |
+ | |
+ | |
+def select_configs_for_scheduler(configs, scheduler): | |
+ """Filter out from configs those entries whose 'schedulers' list does not contain scheduler.""" | |
+ return [config for config in configs if scheduler in config["schedulers"]] | |
+ | |
+ | |
+def select_configs_for_node_role(configs, node_role): | |
+ """Filter out from configs those entries whose 'node_roles' list does not contain node_role.""" | |
+ return [config for config in configs if node_role in config["node_roles"]] | |
+ | |
+ | |
+def select_configs_for_platform(configs, platform): | |
+ """Filter out from configs those entries whose 'platforms' list does not contain platform.""" | |
+ return [config for config in configs if platform in config["platforms"]] | |
+ | |
+ | |
+def get_node_info(): | |
+ """Return the information encoded in the JSON file at /etc/chef/dna.json.""" | |
+ node_info = {} | |
+ dna_path = "/etc/chef/dna.json" | |
+ if os.path.isfile(dna_path): | |
+ with open(dna_path) as node_info_file: | |
+ node_info = json.load(node_info_file).get("cluster") | |
+ return node_info | |
+ | |
+ | |
+def select_configs_for_feature(configs): | |
+ """Filter out from configs those entries whose 'feature_conditions' list contains an unsatisfied entry.""" | |
+ selected_configs = [] | |
+ node_info = get_node_info() | |
+ for config in configs: | |
+ conditions = config.get("feature_conditions", []) | |
+ for condition in conditions: | |
+ dna_keys = condition.get("dna_key") | |
+ if isinstance(dna_keys, str): # dna_key can be a string for single level dict or a list for nested dicts | |
+ dna_keys = [dna_keys] | |
+ value = node_info | |
+ for key in dna_keys: | |
+ value = value.get(key) | |
+ if value is None: | |
+ break | |
+ if value not in condition.get("satisfying_values"): | |
+ break | |
+ else: | |
+ selected_configs.append(config) | |
+ return selected_configs | |
+ | |
+ | |
+def select_logs(configs, args): | |
+ """Select the appropriate set of log configs.""" | |
+ selected_configs = select_configs_for_scheduler(configs, args.scheduler) | |
+ selected_configs = select_configs_for_node_role(selected_configs, args.node_role) | |
+ selected_configs = select_configs_for_platform(selected_configs, args.platform) | |
+ selected_configs = select_configs_for_feature(selected_configs) | |
+ return selected_configs | |
+ | |
+ | |
+def get_node_roles(scheduler_plugin_node_roles): | |
+    node_type_roles_map = {"ALL": ["ComputeFleet", "HeadNode"], "HEAD": ["HeadNode"], "COMPUTE": ["ComputeFleet"]} | |
+    return node_type_roles_map.get(scheduler_plugin_node_roles) | |
+ | |
+ | |
+def load_config(cluster_config_path): | |
+ with open(cluster_config_path) as input_file: | |
+ return yaml.load(input_file, Loader=yaml.SafeLoader) | |
+ | |
+ | |
+def add_scheduler_plugin_log(config_data, cluster_config_path): | |
+ """Add custom log files to config data if log files specified in scheduler plugin.""" | |
+ cluster_config = load_config(cluster_config_path) | |
+ if ( | |
+ get_dict_value(cluster_config, "Scheduling.SchedulerSettings.SchedulerDefinition.Monitoring.Logs.Files") | |
+ and get_dict_value(cluster_config, "Scheduling.Scheduler") == "plugin" | |
+ ): | |
+ log_files = get_dict_value( | |
+ cluster_config, "Scheduling.SchedulerSettings.SchedulerDefinition.Monitoring.Logs.Files" | |
+ ) | |
+ for log_file in log_files: | |
+ # Add log config | |
+ log_config = { | |
+ "timestamp_format_key": log_file.get("LogStreamName"), | |
+ "file_path": log_file.get("FilePath"), | |
+ "log_stream_name": log_file.get("LogStreamName"), | |
+ "schedulers": ["plugin"], | |
+ "platforms": ["centos", "ubuntu", "amazon"], | |
+ "node_roles": get_node_roles(log_file.get("NodeType")), | |
+ "feature_conditions": [], | |
+ } | |
+ config_data["log_configs"].append(log_config) | |
+ | |
+ # Add timestamp formats | |
+ config_data["timestamp_formats"][log_file.get("LogStreamName")] = log_file.get("TimestampFormat") | |
+ return config_data | |
+ | |
+ | |
+def add_timestamps(configs, timestamps_dict): | |
+ """For each config, set its timestamp_format field based on its timestamp_format_key field.""" | |
+ for config in configs: | |
+ config["timestamp_format"] = timestamps_dict[config["timestamp_format_key"]] | |
+ return configs | |
+ | |
+ | |
+def filter_output_fields(configs): | |
+ """Remove fields that are not required by CloudWatch agent config file.""" | |
+ desired_keys = ["log_stream_name", "file_path", "timestamp_format", "log_group_name"] | |
+ return [{desired_key: config[desired_key] for desired_key in desired_keys} for config in configs] | |
+ | |
+ | |
+def create_config(log_configs): | |
+ """Return a dict representing the structure of the output JSON.""" | |
+ return { | |
+ "logs": { | |
+ "logs_collected": {"files": {"collect_list": log_configs}}, | |
+ "log_stream_name": "{host}.{{instance_id}}.default-log-stream".format(host=gethostname()), | |
+ } | |
+ } | |
+ | |
+ | |
+def get_dict_value(value, attributes, default=None): | |
+ """Get key value from dictionary and return default if the key does not exist.""" | |
+ for key in attributes.split("."): | |
+ value = value.get(key, None) | |
+ if value is None: | |
+ return default | |
+ return value | |
+ | |
+ | |
+def main(): | |
+ """Create cloudwatch agent config file.""" | |
+ args = parse_args() | |
+ config_data = read_data(args.config) | |
+ if args.cluster_config_path: | |
+ config_data = add_scheduler_plugin_log(config_data, args.cluster_config_path) | |
+ log_configs = select_logs(config_data["log_configs"], args) | |
+ log_configs = add_timestamps(log_configs, config_data["timestamp_formats"]) | |
+ log_configs = add_log_group_name_params(args.log_group, log_configs) | |
+ log_configs = add_instance_log_stream_prefixes(log_configs) | |
+ log_configs = filter_output_fields(log_configs) | |
+ write_config(create_config(log_configs)) | |
+ | |
+ | |
+if __name__ == "__main__": | |
+    main() | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] owner changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] group changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[write_cloudwatch_agent_json.py] mode changed to 755 | |
- change mode from '' to '0755' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
* cookbook_file[cloudwatch_log_files.json] action create | |
[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_files.json] action create (aws-parallelcluster-config::cloudwatch_agent line 29) | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] created file /usr/local/etc/cloudwatch_log_files.json | |
- create new file /usr/local/etc/cloudwatch_log_files.json | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] updated file contents /usr/local/etc/cloudwatch_log_files.json | |
- update content in file /usr/local/etc/cloudwatch_log_files.json from none to 146300 | |
--- /usr/local/etc/cloudwatch_log_files.json 2022-06-30 15:08:59.338537669 +0000 | |
+++ /usr/local/etc/.chef-cloudwatch_log_files20220630-1116-q8skll.json 2022-06-30 15:08:59.338537669 +0000 | |
@@ -1 +1,550 @@ | |
+{ | |
+ "timestamp_formats": { | |
+ "month_first": "%b %-d %H:%M:%S", | |
+ "default": "%Y-%m-%d %H:%M:%S,%f", | |
+ "bracket_default": "[%Y-%m-%d %H:%M:%S]", | |
+ "slurm": "%Y-%m-%dT%H:%M:%S.%f" | |
+ }, | |
+ "log_configs": [ | |
+ { | |
+ "timestamp_format_key": "month_first", | |
+ "file_path": "/var/log/messages", | |
+ "log_stream_name": "system-messages", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "month_first", | |
+ "file_path": "/var/log/syslog", | |
+ "log_stream_name": "syslog", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/cfn-init.log", | |
+ "log_stream_name": "cfn-init", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/chef-client.log", | |
+ "log_stream_name": "chef-client", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/cloud-init.log", | |
+ "log_stream_name": "cloud-init", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/cloud-init-output.log", | |
+ "log_stream_name": "cloud-init-output", | |
+ "schedulers": [ | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/supervisord.log", | |
+ "log_stream_name": "supervisord", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/clustermgtd", | |
+ "log_stream_name": "clustermgtd", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/computemgtd", | |
+ "log_stream_name": "computemgtd", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/slurm_resume.log", | |
+ "log_stream_name": "slurm_resume", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/slurm_suspend.log", | |
+ "log_stream_name": "slurm_suspend", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "slurm", | |
+ "file_path": "/var/log/slurmd.log", | |
+ "log_stream_name": "slurmd", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "slurm", | |
+ "file_path": "/var/log/slurmctld.log", | |
+ "log_stream_name": "slurmctld", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "amazon", | |
+ "centos", | |
+ "ubuntu" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/pcluster_dcv_authenticator.log", | |
+ "log_stream_name": "dcv-authenticator", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/sssd/sssd.log", | |
+ "log_stream_name": "sssd", | |
+ "schedulers": [ | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": [ | |
+ "directory_service", | |
+ "enabled" | |
+ ], | |
+ "satisfying_values": ["true"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/sssd/sssd_default.log", | |
+ "log_stream_name": "sssd_domain_default", | |
+ "schedulers": [ | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": [ | |
+ "directory_service", | |
+ "enabled" | |
+ ], | |
+ "satisfying_values": ["true"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/pam_ssh_key_generator.log", | |
+ "log_stream_name": "pam_ssh_key_generator", | |
+ "schedulers": [ | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": [ | |
+ "directory_service", | |
+ "generate_ssh_keys_for_users" | |
+ ], | |
+ "satisfying_values": ["true"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "bracket_default", | |
+ "file_path": "/var/log/parallelcluster/pcluster_dcv_connect.log", | |
+ "log_stream_name": "dcv-ext-authenticator", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/dcv/server.log", | |
+ "log_stream_name": "dcv-server", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/dcv/sessionlauncher.log", | |
+ "log_stream_name": "dcv-session-launcher", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/dcv/agent.*.log", | |
+ "log_stream_name": "dcv-agent", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/dcv/dcv-xsession.*.log", | |
+ "log_stream_name": "dcv-xsession", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/dcv/Xdcv.*.log", | |
+ "log_stream_name": "Xdcv", | |
+ "schedulers": [ | |
+ "awsbatch", | |
+ "slurm", | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "dcv_enabled", | |
+ "satisfying_values": ["head_node"] | |
+ } | |
+ ] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/scheduler-plugin.out.log", | |
+ "log_stream_name": "scheduler-plugin-out", | |
+ "schedulers": [ | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/scheduler-plugin.err.log", | |
+ "log_stream_name": "scheduler-plugin-err", | |
+ "schedulers": [ | |
+ "plugin" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet", | |
+ "HeadNode" | |
+ ], | |
+ "feature_conditions": [] | |
+ }, | |
+ { | |
+ "timestamp_format_key": "default", | |
+ "file_path": "/var/log/parallelcluster/slurm_prolog_epilog.log", | |
+ "log_stream_name": "slurm_prolog_epilog", | |
+ "schedulers": [ | |
+ "slurm" | |
+ ], | |
+ "platforms": [ | |
+ "centos", | |
+ "ubuntu", | |
+ "amazon" | |
+ ], | |
+ "node_roles": [ | |
+ "ComputeFleet" | |
+ ], | |
+ "feature_conditions": [ | |
+ { | |
+ "dna_key": "use_private_hostname", | |
+ "satisfying_values": ["true"] | |
+ } | |
+ ] | |
+ } | |
+ ] | |
+} | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] owner changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] group changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files.json] mode changed to 644 | |
- change mode from '' to '0644' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
* cookbook_file[cloudwatch_log_files_schema.json] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_files_schema.json] action create (aws-parallelcluster-config::cloudwatch_agent line 39) | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] created file /usr/local/etc/cloudwatch_log_files_schema.json | |
- create new file /usr/local/etc/cloudwatch_log_files_schema.json[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] updated file contents /usr/local/etc/cloudwatch_log_files_schema.json | |
- update content in file /usr/local/etc/cloudwatch_log_files_schema.json from none to 0a6242 | |
--- /usr/local/etc/cloudwatch_log_files_schema.json 2022-06-30 15:08:59.406539343 +0000 | |
+++ /usr/local/etc/.chef-cloudwatch_log_files_schema20220630-1116-hhjy0i.json 2022-06-30 15:08:59.406539343 +0000 | |
@@ -1 +1,51 @@ | |
+{ | |
+ "type": "object", | |
+ "properties": { | |
+ "timestamp_formats": {"type": "object"}, | |
+ "log_configs": { | |
+ "type": "array", | |
+ "items": { | |
+ "type": "object", | |
+ "properties": { | |
+ "timestamp_format_key": {"type": "string"}, | |
+ "file_path": {"type": "string"}, | |
+ "log_stream_name": {"type": "string"}, | |
+ "schedulers": { | |
+ "type": "array", | |
+ "items": {"type": "string", "enum": ["awsbatch", "slurm", "plugin"]} | |
+ }, | |
+ "platforms": { | |
+ "type": "array", | |
+ "items": {"type": "string", "enum": ["amazon", "centos", "ubuntu"]} | |
+ }, | |
+ "node_roles": { | |
+ "type": "array", | |
+ "items": {"type": "string", "enum": ["HeadNode", "ComputeFleet"]} | |
+ }, | |
+ "feature_conditions": { | |
+ "type": "array", | |
+ "items": { | |
+ "type": "object", | |
+ "properties": { | |
+ "dna_key": {"type": ["string", "array"]}, | |
+ "satisfying_values": {"type": "array", "items": {"type": "string"}} | |
+ }, | |
+ "required": ["dna_key", "satisfying_values"] | |
+ } | |
+ } | |
+ }, | |
+ "required": [ | |
+ "node_roles", | |
+ "platforms", | |
+ "schedulers", | |
+ "log_stream_name", | |
+ "file_path", | |
+ "timestamp_format_key", | |
+ "feature_conditions" | |
+ ] | |
+ } | |
+ } | |
+ }, | |
+ "required": ["timestamp_formats", "log_configs"] | |
+} | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] owner changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] group changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_files_schema.json] mode changed to 644 | |
- change mode from '' to '0644' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
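The schema file written above drives the validation step that runs later in this recipe. As a stdlib-only sketch of what it enforces per `log_configs` entry (the sample entry mirrors the dcv-agent config logged earlier in this run; `check_entry` is a hypothetical helper, not part of the cookbook, which delegates the real work to jsonschema):

```python
# Stdlib-only sketch of the per-entry constraints the schema above encodes;
# the actual utility validates the whole document with jsonschema instead.
REQUIRED = {"timestamp_format_key", "file_path", "log_stream_name",
            "schedulers", "platforms", "node_roles", "feature_conditions"}
SCHEDULERS = {"awsbatch", "slurm", "plugin"}
PLATFORMS = {"amazon", "centos", "ubuntu"}
NODE_ROLES = {"HeadNode", "ComputeFleet"}

def check_entry(entry):
    """Return a list of problems found in a single log_configs entry."""
    problems = []
    missing = REQUIRED - entry.keys()
    if missing:
        problems.append("missing keys: %s" % ", ".join(sorted(missing)))
    for field, allowed in (("schedulers", SCHEDULERS),
                           ("platforms", PLATFORMS),
                           ("node_roles", NODE_ROLES)):
        bad = set(entry.get(field, [])) - allowed
        if bad:
            problems.append("invalid %s: %s" % (field, ", ".join(sorted(bad))))
    return problems

# Sample entry mirroring the dcv-agent config shown earlier in the run.
entry = {
    "timestamp_format_key": "default",
    "file_path": "/var/log/dcv/agent.*.log",
    "log_stream_name": "dcv-agent",
    "schedulers": ["awsbatch", "slurm", "plugin"],
    "platforms": ["centos", "ubuntu", "amazon"],
    "node_roles": ["HeadNode"],
    "feature_conditions": [{"dna_key": "dcv_enabled",
                            "satisfying_values": ["head_node"]}],
}
print(check_entry(entry))                    # -> []
print(check_entry({"schedulers": ["sge"]}))  # missing keys + bad enum value
```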
* cookbook_file[cloudwatch_log_configs_util.py] action create[2022-06-30T15:08:59+00:00] INFO: Processing cookbook_file[cloudwatch_log_configs_util.py] action create (aws-parallelcluster-config::cloudwatch_agent line 49) | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] created file /usr/local/bin/cloudwatch_log_configs_util.py | |
- create new file /usr/local/bin/cloudwatch_log_configs_util.py[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] updated file contents /usr/local/bin/cloudwatch_log_configs_util.py | |
- update content in file /usr/local/bin/cloudwatch_log_configs_util.py from none to bc8f92 | |
--- /usr/local/bin/cloudwatch_log_configs_util.py 2022-06-30 15:08:59.454540525 +0000 | |
+++ /usr/local/bin/.chef-cloudwatch_log_configs_util20220630-1116-h3x11d.py 2022-06-30 15:08:59.454540525 +0000 | |
@@ -1 +1,189 @@ | |
+""" | |
+Validate and modify the data in the cloudwatch_log_files.json cookbook file. | |
+ | |
+This file is used to validate and add data to the JSON file that's used to | |
+configure the CloudWatch agent on a cluster's EC2 instances. The structure of | |
+the new and/or existing data is validated in the following ways: | |
+* jsonschema is used to ensure that the input and output configs both possess | |
+ a valid structure. See cloudwatch_log_files_schema.json for the schema. | |
+* For each log_configs entry, it's verified that its timestamp_format_key is | |
+  a valid key into the same config file's timestamp_formats object. | |
+* It's verified that all log_configs entries have unique values for their | |
+ log_stream_name and file_path attributes. | |
+""" | |
+ | |
+import argparse | |
+import collections | |
+import json | |
+import os | |
+import shutil | |
+import sys | |
+ | |
+import jsonschema | |
+ | |
+DEFAULT_SCHEMA_PATH = os.path.realpath(os.path.join(os.path.curdir, "cloudwatch_log_files_schema.json")) | |
+SCHEMA_PATH = os.environ.get("CW_LOGS_CONFIGS_SCHEMA_PATH", DEFAULT_SCHEMA_PATH) | |
+DEFAULT_LOG_CONFIGS_PATH = os.path.realpath(os.path.join(os.path.curdir, "cloudwatch_log_files.json")) | |
+LOG_CONFIGS_PATH = os.environ.get("CW_LOGS_CONFIGS_PATH", DEFAULT_LOG_CONFIGS_PATH) | |
+LOG_CONFIGS_BAK_PATH = "{}.bak".format(LOG_CONFIGS_PATH) | |
+ | |
+ | |
+def _fail(message): | |
+ """Exit nonzero with the given error message.""" | |
+ sys.exit(message) | |
+ | |
+ | |
+def parse_args(): | |
+ """Parse command line args.""" | |
+ parser = argparse.ArgumentParser( | |
+        description="Validate or add new CloudWatch log configs.", | |
+ epilog="If neither --input-json nor --input-file are used, this script will validate the existing config.", | |
+ ) | |
+ add_group = parser.add_mutually_exclusive_group() | |
+ add_group.add_argument( | |
+ "--input-file", type=argparse.FileType("r"), help="Path to file containing configs for log files to add." | |
+ ) | |
+ add_group.add_argument("--input-json", type=json.loads, help="String containing configs for log files to add.") | |
+ return parser.parse_args() | |
+ | |
+ | |
+def get_input_json(args): | |
+    """Either load the input JSON data from a file, or return the JSON parsed from the CLI.""" | |
+ if args.input_file: | |
+ with args.input_file: | |
+ return json.load(args.input_file) | |
+ else: | |
+ return args.input_json | |
+ | |
+ | |
+def _read_json_at(path): | |
+ """Read the JSON file at path.""" | |
+ try: | |
+ with open(path) as input_file: | |
+ return json.load(input_file) | |
+ except FileNotFoundError: | |
+ _fail("No file exists at {}".format(path)) | |
+ except ValueError: | |
+ _fail("File at {} contains invalid JSON".format(path)) | |
+ | |
+ | |
+def _read_schema(): | |
+ """Read the schema for the CloudWatch log configs file.""" | |
+ return _read_json_at(SCHEMA_PATH) | |
+ | |
+ | |
+def _read_log_configs(): | |
+ """Read the current version of the CloudWatch log configs file, cloudwatch_log_files.json.""" | |
+ return _read_json_at(LOG_CONFIGS_PATH) | |
+ | |
+ | |
+def _validate_json_schema(input_json): | |
+ """Ensure the structure of input_json matches the schema.""" | |
+ schema = _read_schema() | |
+ try: | |
+ jsonschema.validate(input_json, schema) | |
+ except jsonschema.exceptions.ValidationError as validation_err: | |
+ _fail(str(validation_err)) | |
+ | |
+ | |
+def _validate_timestamp_keys(input_json): | |
+ """Ensure the timestamp_format_key values in input_json's log_configs entries are valid.""" | |
+ valid_keys = set() | |
+ for config in (input_json, _read_log_configs()): | |
+ valid_keys |= set(config.get("timestamp_formats").keys()) | |
+ for log_config in input_json.get("log_configs"): | |
+ if log_config.get("timestamp_format_key") not in valid_keys: | |
+ _fail( | |
+ "Log config with log_stream_name {log_stream_name} and file_path {file_path} contains an invalid " | |
+ "timestamp_format_key: {timestamp_format_key}. Valid values are {valid_keys}".format( | |
+ log_stream_name=log_config.get("log_stream_name"), | |
+ file_path=log_config.get("file_path"), | |
+ timestamp_format_key=log_config.get("timestamp_format_key"), | |
+ valid_keys=", ".join(valid_keys), | |
+ ) | |
+ ) | |
+ | |
+ | |
+def _get_duplicate_values(seq): | |
+ """Get the duplicate values in seq.""" | |
+ counter = collections.Counter(seq) | |
+ return [value for value, count in counter.items() if count > 1] | |
+ | |
+ | |
+def _validate_log_config_fields_uniqueness(input_json): | |
+ """Ensure that each entry in input_json's log_configs list has a unique log_stream_name and file_path.""" | |
+ unique_fields = ("log_stream_name", "file_path") | |
+ for field in unique_fields: | |
+ duplicates = _get_duplicate_values([config.get(field) for config in input_json.get("log_configs")]) | |
+ if duplicates: | |
+ _fail( | |
+ "The following {field} values are used multiple times: {duplicates}".format( | |
+ field=field, duplicates=", ".join(duplicates) | |
+ ) | |
+ ) | |
+ | |
+ | |
+def validate_json(input_json=None): | |
+ """Ensure the structure of input_json matches that of the file it will be added to.""" | |
+ if input_json is None: | |
+ input_json = _read_log_configs() | |
+ _validate_json_schema(input_json) | |
+ _validate_timestamp_keys(input_json) | |
+ _validate_log_config_fields_uniqueness(input_json) | |
+ | |
+ | |
+def _write_log_configs(log_configs): | |
+ """Write log_configs back to the CloudWatch log configs file.""" | |
+ log_configs_path = os.environ.get("CW_LOGS_CONFIGS_PATH", DEFAULT_LOG_CONFIGS_PATH) | |
+ with open(log_configs_path, "w") as log_configs_file: | |
+ json.dump(log_configs, log_configs_file, indent=2) | |
+ | |
+ | |
+def write_validated_json(input_json): | |
+ """Write validated JSON back to the CloudWatch log configs file.""" | |
+ log_configs = _read_log_configs() | |
+ log_configs["log_configs"].extend(input_json.get("log_configs")) | |
+ | |
+    # NOTICE: the input JSON's timestamp_formats dict is the one that is | |
+    # updated, so that those defined in the original config aren't clobbered. | |
+    # dict.update() mutates in place and returns None, so merge, then assign. | |
+    input_json["timestamp_formats"].update(log_configs.get("timestamp_formats")) | |
+    log_configs["timestamp_formats"] = input_json["timestamp_formats"] | |
+ _write_log_configs(log_configs) | |
+ | |
+ | |
+def create_backup(): | |
+ """Create a backup of the file at LOG_CONFIGS_PATH.""" | |
+ shutil.copyfile(LOG_CONFIGS_PATH, LOG_CONFIGS_BAK_PATH) | |
+ | |
+ | |
+def restore_backup(): | |
+ """Replace the file at LOG_CONFIGS_PATH with the backup that was created in create_backup.""" | |
+ shutil.move(LOG_CONFIGS_BAK_PATH, LOG_CONFIGS_PATH) | |
+ | |
+ | |
+def remove_backup(): | |
+ """Remove the backup created by create_backup.""" | |
+ try: | |
+ os.remove(LOG_CONFIGS_BAK_PATH) | |
+ except FileNotFoundError: | |
+ pass | |
+ | |
+ | |
+def main(): | |
+ """Run the script.""" | |
+ args = parse_args() | |
+ create_backup() | |
+ try: | |
+ if args.input_file or args.input_json: | |
+ input_json = get_input_json(args) | |
+ validate_json(input_json) | |
+ write_validated_json(input_json) | |
+ validate_json() | |
+    except BaseException: | |
+        # _fail() exits via SystemExit, which `except Exception` would not | |
+        # catch; restore the original config, then let the failure propagate. | |
+        restore_backup() | |
+        raise | |
+ finally: | |
+ remove_backup() | |
+ | |
+ | |
+if __name__ == "__main__": | |
+    main() | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] owner changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] group changed to 0 | |
[2022-06-30T15:08:59+00:00] INFO: cookbook_file[cloudwatch_log_configs_util.py] mode changed to 644 | |
- change mode from '' to '0644' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
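A detail worth noting in the `write_validated_json` function of the utility above: `dict.update` mutates its receiver in place and returns `None`, so a one-liner like `merged = a.update(b)` stores `None` rather than the merged dict. A minimal sketch of the intended merge direction, where existing formats win over newly supplied ones (the format strings are illustrative, not taken from the actual config):

```python
# Sketch of the timestamp_formats merge: formats already present in the
# existing config must survive a collision with newly supplied ones.
existing = {"default": "%Y-%m-%d %H:%M:%S,%f"}
incoming = {"default": "%b %d %H:%M:%S", "custom_app": "%d/%b/%Y:%H:%M:%S"}

# dict.update returns None, so merge in place first, then assign.
incoming.update(existing)   # keys from `existing` clobber `incoming`
merged = incoming

assert merged["default"] == "%Y-%m-%d %H:%M:%S,%f"  # original format kept
assert "custom_app" in merged                        # new format still added

# The tempting one-liner would instead store None:
assert incoming.update({}) is None
```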
* execute[cloudwatch-config-validation] action run[2022-06-30T15:08:59+00:00] INFO: Processing execute[cloudwatch-config-validation] action run (aws-parallelcluster-config::cloudwatch_agent line 58) | |
[2022-06-30T15:09:00+00:00] INFO: execute[cloudwatch-config-validation] ran successfully | |
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/bin/cloudwatch_log_configs_util.py | |
* execute[cloudwatch-config-creation] action run[2022-06-30T15:09:00+00:00] INFO: Processing execute[cloudwatch-config-creation] action run (aws-parallelcluster-config::cloudwatch_agent line 67) | |
[2022-06-30T15:09:00+00:00] INFO: execute[cloudwatch-config-creation] ran successfully | |
- execute /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/bin/write_cloudwatch_agent_json.py --platform ubuntu --config $CONFIG_DATA_PATH --log-group $LOG_GROUP_NAME --scheduler $SCHEDULER --node-role $NODE_ROLE | |
* execute[cloudwatch-agent-start] action run[2022-06-30T15:09:00+00:00] INFO: Processing execute[cloudwatch-agent-start] action run (aws-parallelcluster-config::cloudwatch_agent line 84) | |
[execute] ****** processing amazon-cloudwatch-agent ****** | |
/opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default | |
2022/06/30 15:09:01 D! [EC2] Found active network interface | |
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp | |
Start configuration validation... | |
/opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default | |
2022/06/30 15:09:03 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp ... | |
2022/06/30 15:09:03 I! Valid Json input schema. | |
I! Detecting run_as_user... | |
2022/06/30 15:09:03 D! [EC2] Found active network interface | |
No csm configuration found. | |
No metric configuration found. | |
Configuration validation first phase succeeded | |
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml | |
Configuration validation second phase succeeded | |
Configuration validation succeeded | |
amazon-cloudwatch-agent has already been stopped | |
Created symlink /etc/systemd/system/multi-user.target.wants/amazon-cloudwatch-agent.service → /etc/systemd/system/amazon-cloudwatch-agent.service. | |
[2022-06-30T15:09:11+00:00] INFO: execute[cloudwatch-agent-start] ran successfully | |
- execute /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s | |
Recipe: aws-parallelcluster-config::network_interfaces | |
* log[macs: ["02:4c:91:fd:c4:f4"]] action write[2022-06-30T15:09:11+00:00] INFO: Processing log[macs: ["02:4c:91:fd:c4:f4"]] action write (aws-parallelcluster-config::network_interfaces line 63) | |
[2022-06-30T15:09:11+00:00] INFO: macs: ["02:4c:91:fd:c4:f4"] | |
Recipe: aws-parallelcluster-slurm::init | |
* directory[/etc/parallelcluster/slurm_plugin] action create[2022-06-30T15:09:11+00:00] INFO: Processing directory[/etc/parallelcluster/slurm_plugin] action create (aws-parallelcluster-slurm::init line 20) | |
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] created directory /etc/parallelcluster/slurm_plugin | |
- create new directory /etc/parallelcluster/slurm_plugin[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] owner changed to 0 | |
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] group changed to 0 | |
[2022-06-30T15:09:11+00:00] INFO: directory[/etc/parallelcluster/slurm_plugin] mode changed to 755 | |
- change mode from '' to '0755' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
Recipe: aws-parallelcluster-slurm::init_dns | |
* replace_or_add[append Route53 search domain in /etc/systemd/resolved.conf] action edit[2022-06-30T15:09:11+00:00] INFO: Processing replace_or_add[append Route53 search domain in /etc/systemd/resolved.conf] action edit (aws-parallelcluster-slurm::init_dns line 31) | |
* file[/etc/systemd/resolved.conf] action create[2022-06-30T15:09:11+00:00] INFO: Processing file[/etc/systemd/resolved.conf] action create (/etc/chef/local-mode-cache/cache/cookbooks/line/resources/replace_or_add.rb line 41) | |
[2022-06-30T15:09:11+00:00] INFO: file[/etc/systemd/resolved.conf] updated file contents /etc/systemd/resolved.conf | |
- update content in file /etc/systemd/resolved.conf from e12793 to 530e52 | |
- suppressed sensitive resource | |
* service[systemd-resolved] action restart[2022-06-30T15:09:11+00:00] INFO: Processing service[systemd-resolved] action restart (aws-parallelcluster-slurm::init_dns line 259) | |
[2022-06-30T15:09:12+00:00] INFO: service[systemd-resolved] restarted | |
- restart service service[systemd-resolved] | |
* hostname[set short hostname] action set[2022-06-30T15:09:12+00:00] INFO: Processing hostname[set short hostname] action set (aws-parallelcluster-slurm::init_dns line 85) | |
* ohai[reload hostname] action nothing[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload hostname] action nothing (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 139) | |
(skipped due to action :nothing) | |
* execute[set hostname to ip-10-0-0-21] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[set hostname to ip-10-0-0-21] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 145) | |
(skipped due to not_if) | |
* file[/etc/hosts] action create[2022-06-30T15:09:12+00:00] INFO: Processing file[/etc/hosts] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 106) | |
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] backed up to /etc/chef/local-mode-cache/backup/etc/hosts.chef-20220630150912.089438 | |
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] updated file contents /etc/hosts | |
- update content in file /etc/hosts from aa4ea9 to 9fc279 | |
--- /etc/hosts 2022-05-08 22:10:36.000000000 +0000 | |
+++ /etc/.chef-hosts20220630-1116-co0crc 2022-06-30 15:09:12.086851581 +0000 | |
@@ -7,4 +7,5 @@ | |
ff02::1 ip6-allnodes | |
ff02::2 ip6-allrouters | |
ff02::3 ip6-allhosts | |
+10.0.0.21 ip-10-0-0-21 ip-10-0-0-21 | |
* execute[hostnamectl set-hostname ip-10-0-0-21] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[hostnamectl set-hostname ip-10-0-0-21] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 188) | |
(skipped due to not_if) | |
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] sending reload action to ohai[reload hostname] (delayed) | |
* ohai[reload hostname] action reload[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload hostname] action reload (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/hostname.rb line 139) | |
[2022-06-30T15:09:12+00:00] INFO: ohai[reload hostname] reloaded | |
- re-run ohai and merge results into node attributes | |
* ohai[reload_hostname] action nothing[2022-06-30T15:09:12+00:00] INFO: Processing ohai[reload_hostname] action nothing (aws-parallelcluster-slurm::init_dns line 91) | |
(skipped due to action :nothing) | |
* replace_or_add[set fqdn in the /etc/hosts] action edit[2022-06-30T15:09:12+00:00] INFO: Processing replace_or_add[set fqdn in the /etc/hosts] action edit (aws-parallelcluster-slurm::init_dns line 97) | |
* file[/etc/hosts] action create[2022-06-30T15:09:12+00:00] INFO: Processing file[/etc/hosts] action create (/etc/chef/local-mode-cache/cache/cookbooks/line/resources/replace_or_add.rb line 41) | |
[2022-06-30T15:09:12+00:00] INFO: file[/etc/hosts] updated file contents /etc/hosts | |
- update content in file /etc/hosts from 9fc279 to 5f5a85 | |
- suppressed sensitive resource | |
Recipe: aws-parallelcluster-config::imds | |
* directory[/opt/parallelcluster/scripts/imds] action create[2022-06-30T15:09:12+00:00] INFO: Processing directory[/opt/parallelcluster/scripts/imds] action create (aws-parallelcluster-config::imds line 23) | |
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] created directory /opt/parallelcluster/scripts/imds | |
- create new directory /opt/parallelcluster/scripts/imds[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] owner changed to 0 | |
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] group changed to 0 | |
[2022-06-30T15:09:12+00:00] INFO: directory[/opt/parallelcluster/scripts/imds] mode changed to 744 | |
- change mode from '' to '0744' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
* cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] action create[2022-06-30T15:09:12+00:00] INFO: Processing cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] action create (aws-parallelcluster-config::imds line 32) | |
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] created file /opt/parallelcluster/scripts/imds/imds-access.sh | |
- create new file /opt/parallelcluster/scripts/imds/imds-access.sh[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] updated file contents /opt/parallelcluster/scripts/imds/imds-access.sh | |
- update content in file /opt/parallelcluster/scripts/imds/imds-access.sh from none to 690d14 | |
--- /opt/parallelcluster/scripts/imds/imds-access.sh 2022-06-30 15:09:12.526862416 +0000 | |
+++ /opt/parallelcluster/scripts/imds/.chef-imds-access20220630-1116-yh20ek.sh 2022-06-30 15:09:12.526862416 +0000 | |
@@ -1 +1,163 @@ | |
+#!/bin/bash | |
+set -e | |
+# | |
+# Manage the access to IMDS | |
+# | |
+# --allow <user1,...,userN> List of users to allow access to IMDS | |
+# --deny <user1,...,userN> List of users to deny access to IMDS | |
+# --unset <user1,...,userN> Remove iptables rules related to IMDS for the given list of users | |
+# --flush Restore default IMDS access | |
+# --help Print this help message | |
+ | |
+function error() { | |
+ >&2 echo "[ERROR] $1" | |
+ exit 1 | |
+} | |
+ | |
+function info() { | |
+ echo "[INFO] $1" | |
+} | |
+ | |
+function help() { | |
+ local -- cmd=$(basename "$0") | |
+ cat <<EOF | |
+ | |
+ Usage: ${cmd} [OPTION]... | |
+ | |
+ Manage the access to IMDS | |
+ | |
+ --allow <user1,...,userN> Allow IMDS access to the given list of users | |
+ --deny <user1,...,userN> Deny IMDS access to the given list of users | |
+ --unset <user1,...,userN> Remove iptables rules related to IMDS for the given list of users | |
+ --flush Restore default IMDS access | |
+ --help Print this help message | |
+EOF | |
+} | |
+ | |
+function iptables_delete() { | |
+ local chain=$1 | |
+ local destination=$2 | |
+ local jump=$3 | |
+ local user=$4 | |
+ | |
+ # Build iptables delete command | |
+ if [[ -z $user ]]; then | |
+ rule_args="$chain --destination $destination -j $jump" | |
+ else | |
+ rule_args="$chain --destination $destination -j $jump -m owner --uid-owner $user" | |
+ fi | |
+ | |
+ local iptables_delete_command="iptables -D $rule_args" | |
+ | |
+ # Remove rules | |
+ local should_remove=true | |
+ while $should_remove; do | |
+ eval $iptables_delete_command 1>/dev/null 2>/dev/null || should_remove=false | |
+ done | |
+} | |
+ | |
+function iptables_add() { | |
+ local chain=$1 | |
+ local destination=$2 | |
+ local jump=$3 | |
+ local user=$4 | |
+ | |
+ # Remove duplicate rules | |
+ iptables_delete $chain $destination $jump $user | |
+ | |
+ # Remove opposite rules | |
+    if [[ $jump == "ACCEPT" ]]; then | |
+        iptables_delete $chain $destination "REJECT" $user | |
+    elif [[ $jump == "REJECT" ]]; then | |
+        iptables_delete $chain $destination "ACCEPT" $user | |
+    fi | |
+ | |
+ # Build iptables add command | |
+ if [[ -z $user ]]; then | |
+ rule_args="$chain --destination $destination -j $jump" | |
+ else | |
+ rule_args="$chain --destination $destination -j $jump -m owner --uid-owner $user" | |
+ fi | |
+ | |
+ local iptables_add_command="iptables -A $rule_args" | |
+ | |
+ # Add rule | |
+ eval $iptables_add_command | |
+ info "Rule in chain $chain: $destination $jump $user" | |
+} | |
+ | |
+function setup_chain() { | |
+ local chain=$1 | |
+ local source_chain=$2 | |
+ local destination=$3 | |
+ | |
+ iptables --new $chain 2>/dev/null && info "ParallelCluster chain created: $chain" \ | |
+ || info "ParallelCluster chain exists: $chain" | |
+ | |
+ iptables_add $source_chain $destination $chain | |
+} | |
+ | |
+main() { | |
+ # Constants | |
+ PARALLELCLUSTER_CHAIN="PARALLELCLUSTER_IMDS" | |
+ OUTPUT_CHAIN="OUTPUT" | |
+ IMDS_IP="169.254.169.254" | |
+ | |
+ # Parse options | |
+ while [ $# -gt 0 ] ; do | |
+ case "$1" in | |
+ --allow) allow_users="$2"; shift;; | |
+ --deny) deny_users="$2"; shift;; | |
+ --unset) unset_users="$2"; shift;; | |
+ --flush) flush="true";; | |
+ --help) help; exit 0;; | |
+ *) help; error "Unrecognized option '$1'";; | |
+ esac | |
+ shift | |
+ done | |
+ | |
+ # Check required commands | |
+ command -v iptables >/dev/null || error "Cannot find required command: iptables" | |
+ | |
+ # Check arguments and options | |
+ if [[ -z $allow_users && -z $deny_users && -z $unset_users && -z $flush ]]; then | |
+ error "Missing at least one mandatory option: '--allow', '--deny', '--unset', '--flush'" | |
+ fi | |
+ | |
+ # Setup ParallelCluster chain | |
+ setup_chain $PARALLELCLUSTER_CHAIN $OUTPUT_CHAIN $IMDS_IP | |
+ | |
+ # Flush ParallelCluster chain, if required | |
+ if [[ $flush == "true" ]]; then | |
+ iptables --flush $PARALLELCLUSTER_CHAIN | |
+ info "ParallelCluster chain flushed" | |
+ exit 0 | |
+ fi | |
+ | |
+ # Delete rule: ACCEPT/REJECT user, for every user to unset | |
+ IFS="," | |
+ for user in $unset_users; do | |
+ info "Deleting rules related to IMDS access for user: $user" | |
+ iptables_delete $PARALLELCLUSTER_CHAIN $IMDS_IP "ACCEPT" $user | |
+ iptables_delete $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT" $user | |
+ done | |
+ | |
+ # Add rule: ACCEPT user, for every allowed user | |
+ for user in $allow_users; do | |
+ info "Allowing IMDS access for user: $user" | |
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "ACCEPT" $user | |
+ done | |
+ | |
+ # Add rule: REJECT user, for every denied user | |
+ for user in $deny_users; do | |
+ info "Denying IMDS access for user: $user" | |
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT" $user | |
+ done | |
+ | |
+ # Add rule: REJECT not allowed users | |
+ info "Denying IMDS access for not allowed users" | |
+ iptables_add $PARALLELCLUSTER_CHAIN $IMDS_IP "REJECT" | |
+} | |
+ | |
+main "$@" | |
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] owner changed to 0 | |
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] group changed to 0 | |
[2022-06-30T15:09:12+00:00] INFO: cookbook_file[/opt/parallelcluster/scripts/imds/imds-access.sh] mode changed to 744 | |
- change mode from '' to '0744' | |
- change owner from '' to 'root' | |
- change group from '' to 'root' | |
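The allow/deny semantics of the chain that `imds-access.sh` builds rely on iptables evaluating rules within a chain first-match-wins, with the catch-all REJECT appended last. A small Python simulation of that evaluation order (the rule list matches this run's output; `imds_verdict` is purely illustrative):

```python
# First-match-wins evaluation of the PARALLELCLUSTER_IMDS chain as built by
# imds-access.sh: per-user ACCEPT rules first, then a catch-all REJECT.
RULES = [
    ("root", "ACCEPT"),
    ("pcluster-admin", "ACCEPT"),
    ("ubuntu", "ACCEPT"),
    (None, "REJECT"),   # no owner-match condition: applies to everyone
]

def imds_verdict(user):
    """Return the verdict for traffic to 169.254.169.254 from `user`."""
    for owner, verdict in RULES:
        if owner is None or owner == user:
            return verdict   # first matching rule decides; later rules ignored
    return "ACCEPT"          # empty chain falls through to the default policy

for user in ("root", "ubuntu", "nobody"):
    print(user, "->", imds_verdict(user))
# root and ubuntu are accepted; any other user hits the final REJECT
```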
* execute[IMDS lockdown enable] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[IMDS lockdown enable] action run (aws-parallelcluster-config::imds line 41) | |
[execute] [INFO] ParallelCluster chain created: PARALLELCLUSTER_IMDS | |
[INFO] Rule in chain OUTPUT: 169.254.169.254 PARALLELCLUSTER_IMDS | |
[INFO] ParallelCluster chain flushed | |
[INFO] ParallelCluster chain exists: PARALLELCLUSTER_IMDS | |
[INFO] Rule in chain OUTPUT: 169.254.169.254 PARALLELCLUSTER_IMDS | |
[INFO] Allowing IMDS access for user: root | |
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT root | |
[INFO] Allowing IMDS access for user: pcluster-admin | |
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT pcluster-admin | |
[INFO] Allowing IMDS access for user: ubuntu | |
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 ACCEPT ubuntu | |
[INFO] Denying IMDS access for not allowed users | |
[INFO] Rule in chain PARALLELCLUSTER_IMDS: 169.254.169.254 REJECT | |
[2022-06-30T15:09:12+00:00] INFO: execute[IMDS lockdown enable] ran successfully | |
- execute bash /opt/parallelcluster/scripts/imds/imds-access.sh --flush && bash /opt/parallelcluster/scripts/imds/imds-access.sh --allow root,pcluster-admin,ubuntu | |
* execute[Save iptables rules] action run[2022-06-30T15:09:12+00:00] INFO: Processing execute[Save iptables rules] action run (aws-parallelcluster-config::imds line 56) | |
[2022-06-30T15:09:12+00:00] INFO: execute[Save iptables rules] ran successfully | |
- execute mkdir -p $(dirname /etc/parallelcluster/sysconfig/iptables.rules) && iptables-save > /etc/parallelcluster/sysconfig/iptables.rules | |
* template[/etc/init.d/parallelcluster-iptables] action create[2022-06-30T15:09:12+00:00] INFO: Processing template[/etc/init.d/parallelcluster-iptables] action create (aws-parallelcluster-config::imds line 60) | |
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] created file /etc/init.d/parallelcluster-iptables | |
- create new file /etc/init.d/parallelcluster-iptables[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] updated file contents /etc/init.d/parallelcluster-iptables | |
- update content in file /etc/init.d/parallelcluster-iptables from none to d54448 | |
--- /etc/init.d/parallelcluster-iptables 2022-06-30 15:09:12.934872464 +0000
+++ /etc/init.d/.chef-parallelcluster-iptables20220630-1116-zldtbi 2022-06-30 15:09:12.934872464 +0000
@@ -1 +1,46 @@
+#!/bin/bash
+#
+# parallelcluster-iptables
+#
+# chkconfig: 12345 99 99
+# description: Backup and restore iptables rules
+
+### BEGIN INIT INFO
+# Provides: $parallelcluster-iptables
+# Required-Start: $network
+# Required-Stop: $network
+# Default-Start: 1 2 3 4 5
+# Default-Stop: 0 6
+# Short-Description: Backup and restore iptables rules
+# Description: Backup and restore iptables rules
+### END INIT INFO
+
+IPTABLES_RULES_FILE="/etc/parallelcluster/sysconfig/iptables.rules"
+
+function start() {
+  if [[ -f $IPTABLES_RULES_FILE ]]; then
+    iptables-restore < $IPTABLES_RULES_FILE
+    echo "iptables rules restored from file: $IPTABLES_RULES_FILE"
+  else
+    echo "iptables rules left unchanged as file was not found: $IPTABLES_RULES_FILE"
+  fi
+}
+
+function stop() {
+  echo "saving iptables rules to file: $IPTABLES_RULES_FILE"
+  mkdir -p $(dirname $IPTABLES_RULES_FILE)
+  iptables-save > $IPTABLES_RULES_FILE
+  echo "iptables rules saved to file: $IPTABLES_RULES_FILE"
+}
+
+case "$1" in
+  start|stop)
+    $1
+    ;;
+  *)
+    echo "Usage: $0 {start|stop}"
+    exit 2
+esac
+
+exit $?
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] owner changed to 0
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] group changed to 0
[2022-06-30T15:09:12+00:00] INFO: template[/etc/init.d/parallelcluster-iptables] mode changed to 744
- change mode from '' to '0744'
- change owner from '' to 'root'
- change group from '' to 'root'
* service[parallelcluster-iptables] action enable
[2022-06-30T15:09:12+00:00] INFO: Processing service[parallelcluster-iptables] action enable (aws-parallelcluster-config::imds line 68)
[2022-06-30T15:09:14+00:00] INFO: service[parallelcluster-iptables] enabled
- enable service service[parallelcluster-iptables]
* service[parallelcluster-iptables] action start
[2022-06-30T15:09:14+00:00] INFO: Processing service[parallelcluster-iptables] action start (aws-parallelcluster-config::imds line 68)
[2022-06-30T15:09:14+00:00] INFO: service[parallelcluster-iptables] started
- start service service[parallelcluster-iptables]
[2022-06-30T15:09:14+00:00] INFO: replace_or_add[set fqdn in the /etc/hosts] sending reload action to ohai[reload_hostname] (delayed)
Recipe: aws-parallelcluster-slurm::init_dns
* ohai[reload_hostname] action reload
[2022-06-30T15:09:14+00:00] INFO: Processing ohai[reload_hostname] action reload (aws-parallelcluster-slurm::init_dns line 91)
[2022-06-30T15:09:14+00:00] INFO: ohai[reload_hostname] reloaded
- re-run ohai and merge results into node attributes
[2022-06-30T15:09:14+00:00] WARN: Skipping final node save because override_runlist was given
[2022-06-30T15:09:14+00:00] INFO: Cinc Client Run complete in 28.057885357 seconds
[2022-06-30T15:09:14+00:00] INFO: Skipping removal of unused files from the cache
Running handlers:
[2022-06-30T15:09:14+00:00] INFO: Running report handlers
Running handlers complete
[2022-06-30T15:09:14+00:00] INFO: Report handlers complete
Deprecation warnings that must be addressed before upgrading to Chef Infra 18:
The resource in the nfs cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/nfs/resources/export.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_global resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/global.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_pip resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/pip.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_plugin resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/plugin.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_python resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/python.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_rehash resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/rehash.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_script resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/script.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_system_install resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/system_install.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The pyenv_user_install resource in the pyenv cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/pyenv/resources/user_install.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The resource in the selinux cookbook should declare `unified_mode true` at 3 locations:
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/install.rb
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/module.rb
- /etc/chef/local-mode-cache/cache/cookbooks/selinux/resources/state.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
The resource in the yum cookbook should declare `unified_mode true` at 1 location:
- /etc/chef/local-mode-cache/cache/cookbooks/yum/resources/globalconfig.rb
See https://docs.chef.io/deprecations_unified_mode/ for further details.
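The `unified_mode` deprecations listed above are warnings, not failures: each named resource file simply predates Chef Infra's unified mode. Per the linked deprecation page, the remedy is a one-line declaration at the top of each custom resource file. A minimal sketch, using the pyenv cookbook's `global.rb` path from the warning as the example (the property and action shown are illustrative placeholders, not the cookbook's actual code):

```ruby
# cookbooks/pyenv/resources/global.rb (sketch)
# Declaring unified_mode true makes the resource's action blocks compile
# and converge in a single pass; this becomes the only behavior in
# Chef Infra 18 and silences the deprecation warning above.
unified_mode true

provides :pyenv_global

# Illustrative property, not the cookbook's real interface.
property :version, String, name_property: true

action :set do
  # existing resource implementation is unchanged by unified_mode
end
```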
Cinc Client finished, 31/36 resources updated in 32 seconds
[2022-06-30T15:09:17+00:00] INFO: Started Cinc Zero at chefzero://localhost:1 with repository at /etc/chef (One version per cookbook)
Starting Cinc Client, version 17.2.29
Patents: https://www.chef.io/patents
[2022-06-30T15:09:17+00:00] INFO: *** Cinc Client 17.2.29 ***
[2022-06-30T15:09:17+00:00] INFO: Platform: x86_64-linux
[2022-06-30T15:09:17+00:00] INFO: Cinc-client pid: 1526
[2022-06-30T15:09:20+00:00] WARN: Run List override has been provided.
[2022-06-30T15:09:20+00:00] WARN: Original Run List: []
[2022-06-30T15:09:20+00:00] WARN: Overridden Run List: [recipe[aws-parallelcluster::config]]
[2022-06-30T15:09:20+00:00] INFO: Run List is [recipe[aws-parallelcluster::config]]
[2022-06-30T15:09:20+00:00] INFO: Run List expands to [aws-parallelcluster::config]
[2022-06-30T15:09:20+00:00] INFO: Starting Cinc Client Run for ip-10-0-0-21.us-east-2.compute.internal
[2022-06-30T15:09:20+00:00] INFO: Running start handlers
[2022-06-30T15:09:20+00:00] INFO: Start handlers complete.
resolving cookbooks for run list: ["aws-parallelcluster::config"]
[2022-06-30T15:09:23+00:00] INFO: Loading cookbooks [aws-parallelcluster@3.1.4, apt@7.4.0, iptables@8.0.0, line@4.0.1, nfs@2.6.4, openssh@2.9.1, pyenv@3.4.2, selinux@3.1.1, yum@6.1.1, yum-epel@4.1.2, aws-parallelcluster-install@3.1.4, aws-parallelcluster-config@3.1.4, aws-parallelcluster-slurm@3.1.4, aws-parallelcluster-scheduler-plugin@3.1.4, aws-parallelcluster-awsbatch@3.1.4, aws-parallelcluster-test@3.1.4]
[2022-06-30T15:09:23+00:00] INFO: Skipping removal of obsoleted cookbooks from the cache
Synchronizing Cookbooks:
- iptables (8.0.0)
- aws-parallelcluster (3.1.4)
- apt (7.4.0)
- line (4.0.1)
- yum (6.1.1)
- selinux (3.1.1)
- nfs (2.6.4)
- pyenv (3.4.2)
- openssh (2.9.1)
- yum-epel (4.1.2)
- aws-parallelcluster-install (3.1.4)
- aws-parallelcluster-awsbatch (3.1.4)
- aws-parallelcluster-config (3.1.4)
- aws-parallelcluster-slurm (3.1.4)
- aws-parallelcluster-scheduler-plugin (3.1.4)
- aws-parallelcluster-test (3.1.4)
Installing Cookbook Gems:
Compiling Cookbooks...
[2022-06-30T15:09:26+00:00] INFO: Detected bootstrap file aws-parallelcluster-cookbook-3.1.4
Converging 68 resources
Recipe: aws-parallelcluster::setup_envars
* ruby_block[Configure environment variable for recipes context: PATH] action run
[2022-06-30T15:09:27+00:00] INFO: Processing ruby_block[Configure environment variable for recipes context: PATH] action run (aws-parallelcluster::setup_envars line 23)
[2022-06-30T15:09:27+00:00] INFO: ruby_block[Configure environment variable for recipes context: PATH] called
- execute the ruby block Configure environment variable for recipes context: PATH
* template[/etc/profile.d/path.sh] action create
[2022-06-30T15:09:27+00:00] INFO: Processing template[/etc/profile.d/path.sh] action create (aws-parallelcluster::setup_envars line 32)
(up to date)
Recipe: aws-parallelcluster-config::openssh
* template[/usr/bin/ssh_target_checker.sh] action create
[2022-06-30T15:09:27+00:00] INFO: Processing template[/usr/bin/ssh_target_checker.sh] action create (aws-parallelcluster-config::openssh line 19)
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] created file /usr/bin/ssh_target_checker.sh
- create new file /usr/bin/ssh_target_checker.sh
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] updated file contents /usr/bin/ssh_target_checker.sh
- update content in file /usr/bin/ssh_target_checker.sh from none to df73f0
--- /usr/bin/ssh_target_checker.sh 2022-06-30 15:09:27.579230955 +0000
+++ /usr/bin/.chef-ssh_target_checker20220630-1526-43dcui.sh 2022-06-30 15:09:27.571230760 +0000
@@ -1 +1,71 @@
+#!/bin/bash
+
+# Copyright 2013-2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the
+# License. A copy of the License is located at
+#
+# http://aws.amazon.com/apache2.0/
+#
+# or in the "LICENSE.txt" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
+# OR CONDITIONS OF ANY KIND, express or implied. See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -o pipefail
+
+VPC_CIDR_LIST=(10.0.0.0/16)
+
+log() {
+  echo "$@" | logger -t "pcluster_ssh_target_checker"
+}
+
+convert_ip_to_decimal() {
+  IFS=./ read -r x y z t mask <<< "${1}"
+  echo -n "$((x<<24|y<<16|z<<8|t))"
+}
+
+convert_mask_to_decimal() {
+  IFS=/ read -r _ mask <<< "${1}"
+  echo -n "$((-1<<(32-mask)))"
+}
+
+check_ip_in_cidr() {
+  target_address=$(convert_ip_to_decimal "${1}")
+  base_address=$(convert_ip_to_decimal "${2}")
+  base_mask=$(convert_mask_to_decimal "${2}")
+
+  if (( (target_address&base_mask) == (base_address&base_mask) )); then
+    return 0
+  fi
+
+  return 1
+}
+
+target_host=$1
+if [[ -z "${target_host}" ]]; then
+  log "No input target host"
+  exit 1
+fi
+
+if ! resolved_ip=$(getent ahosts "${target_host}" | grep -v : | head -1 | cut -d' ' -f1); then
+  log "Cannot resolve target Host ${target_host}"
+  exit 1
+fi
+
+if [[ "${resolved_ip}" == "127.0.0.1" ]]; then
+  # Special case for localhost
+  log "Target Host ${target_host} is in VPC CIDR"
+  exit 0
+fi
+
+for vpc_cidr in "${VPC_CIDR_LIST[@]}"
+do
+  if check_ip_in_cidr "${resolved_ip}" "${vpc_cidr}"; then
+    log "Target Host ${target_host} is in VPC CIDR ${vpc_cidr}"
+    exit 0
+  fi
+done
+
+log "Target Host ${target_host} is not in any VPC CIDR ${VPC_CIDR_LIST[*]}"
+exit 1
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] owner changed to 0
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] group changed to 0
[2022-06-30T15:09:27+00:00] INFO: template[/usr/bin/ssh_target_checker.sh] mode changed to 755
- change mode from '' to '0755'
- change owner from '' to 'root'
- change group from '' to 'root'
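The generated ssh_target_checker.sh above decides VPC membership with shell integer arithmetic: both addresses are masked with the CIDR's network mask and compared. The same decision can be cross-checked with Ruby's standard `ipaddr` library; this is a verification sketch, not part of the cookbook:

```ruby
require "ipaddr"

# Mirror of the script's check_ip_in_cidr: an address belongs to a CIDR
# block when its network-masked bits equal the block's base address.
# IPAddr#include? performs the same mask-and-compare internally.
def ip_in_cidr?(ip, cidr)
  IPAddr.new(cidr).include?(IPAddr.new(ip))
end

vpc_cidr_list = ["10.0.0.0/16"]  # matches VPC_CIDR_LIST in the generated script

ip_in_cidr?("10.0.0.21", vpc_cidr_list.first)    # => true  (this head node)
ip_in_cidr?("172.31.0.151", vpc_cidr_list.first) # => false (outside the VPC)
```

The `true` case corresponds to the script's `exit 0` path, the `false` case to its final `exit 1`.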
Recipe: aws-parallelcluster-config::base
* sysctl[fs.protected_regular] action apply
[2022-06-30T15:09:27+00:00] INFO: Processing sysctl[fs.protected_regular] action apply (aws-parallelcluster-config::base line 23)
* directory[/etc/sysctl.d] action create
[2022-06-30T15:09:27+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] action create
[2022-06-30T15:09:27+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:27+00:00] INFO: file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] created file /etc/sysctl.d/99-chef-fs.protected_regular.conf
- create new file /etc/sysctl.d/99-chef-fs.protected_regular.conf
[2022-06-30T15:09:27+00:00] INFO: file[/etc/sysctl.d/99-chef-fs.protected_regular.conf] updated file contents /etc/sysctl.d/99-chef-fs.protected_regular.conf
- update content in file /etc/sysctl.d/99-chef-fs.protected_regular.conf from none to e8e418
--- /etc/sysctl.d/99-chef-fs.protected_regular.conf 2022-06-30 15:09:27.707234087 +0000
+++ /etc/sysctl.d/.chef-99-chef-fs20220630-1526-g0pt9p.protected_regular.conf 2022-06-30 15:09:27.707234087 +0000
@@ -1 +1,2 @@
+fs.protected_regular = 0
* execute[Load sysctl values] action run
[2022-06-30T15:09:27+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:27+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create fs.protected_regular
- set value to "0"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
Recipe: nfs::_common
* apt_package[nfs-common] action install
[2022-06-30T15:09:27+00:00] INFO: Processing apt_package[nfs-common] action install (nfs::_common line 22)
(up to date)
* apt_package[rpcbind] action install
[2022-06-30T15:09:31+00:00] INFO: Processing apt_package[rpcbind] action install (nfs::_common line 22)
(up to date)
* directory[/etc/default] action create
[2022-06-30T15:09:31+00:00] INFO: Processing directory[/etc/default] action create (nfs::_common line 26)
(skipped due to only_if)
* template[/etc/default/nfs-common] action create
[2022-06-30T15:09:31+00:00] INFO: Processing template[/etc/default/nfs-common] action create (nfs::_common line 36)
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] backed up to /etc/chef/local-mode-cache/backup/etc/default/nfs-common.chef-20220630150931.412023
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] updated file contents /etc/default/nfs-common
- update content in file /etc/default/nfs-common from 1bf5d6 to 89c769
--- /etc/default/nfs-common 2022-05-12 10:22:25.460099899 +0000
+++ /etc/default/.chef-nfs-common20220630-1526-z3yn8h 2022-06-30 15:09:31.407324988 +0000
@@ -1,3 +1,3 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal# Local modifications will be overwritten.
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal# Local modifications will be overwritten.
 STATDOPTS="--port 32765 --outgoing-port 32766"
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[portmap] (immediate)
* service[portmap] action restart
[2022-06-30T15:09:31+00:00] INFO: Processing service[portmap] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[portmap] restarted
- restart service service[portmap]
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[lock] (immediate)
* service[lock] action restart
[2022-06-30T15:09:31+00:00] INFO: Processing service[lock] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[lock] restarted
- restart service service[lock]
[2022-06-30T15:09:31+00:00] INFO: template[/etc/default/nfs-common] sending restart action to service[nfs-config.service] (immediate)
* service[nfs-config.service] action restart
[2022-06-30T15:09:31+00:00] INFO: Processing service[nfs-config.service] action restart (nfs::_common line 46)
[2022-06-30T15:09:31+00:00] INFO: service[nfs-config.service] restarted
- restart service service[nfs-config.service]
* template[/etc/modprobe.d/lockd.conf] action create
[2022-06-30T15:09:31+00:00] INFO: Processing template[/etc/modprobe.d/lockd.conf] action create (nfs::_common line 36)
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] backed up to /etc/chef/local-mode-cache/backup/etc/modprobe.d/lockd.conf.chef-20220630150931.924946
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] updated file contents /etc/modprobe.d/lockd.conf
- update content in file /etc/modprobe.d/lockd.conf from 2bf649 to 859601
--- /etc/modprobe.d/lockd.conf 2022-05-12 10:22:25.676099974 +0000
+++ /etc/modprobe.d/.chef-lockd20220630-1526-6lzsix.conf 2022-06-30 15:09:31.919337663 +0000
@@ -1,4 +1,4 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal
 # Local modifications will be overwritten.
 options lockd nlm_udpport=32768 nlm_tcpport=32768
[2022-06-30T15:09:31+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[portmap] (immediate)
* service[portmap] action restart
[2022-06-30T15:09:31+00:00] INFO: Processing service[portmap] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[portmap] restarted
- restart service service[portmap]
[2022-06-30T15:09:32+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[lock] (immediate)
* service[lock] action restart
[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[lock] restarted
- restart service service[lock]
[2022-06-30T15:09:32+00:00] INFO: template[/etc/modprobe.d/lockd.conf] sending restart action to service[nfs-config.service] (immediate)
* service[nfs-config.service] action restart
[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action restart (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[nfs-config.service] restarted
- restart service service[nfs-config.service]
* service[portmap] action start
[2022-06-30T15:09:32+00:00] INFO: Processing service[portmap] action start (nfs::_common line 46)
(up to date)
* service[portmap] action enable
[2022-06-30T15:09:32+00:00] INFO: Processing service[portmap] action enable (nfs::_common line 46)
(up to date)
* service[lock] action start
[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action start (nfs::_common line 46)
(up to date)
* service[lock] action enable
[2022-06-30T15:09:32+00:00] INFO: Processing service[lock] action enable (nfs::_common line 46)
(up to date)
* service[nfs-config.service] action start
[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action start (nfs::_common line 46)
[2022-06-30T15:09:32+00:00] INFO: service[nfs-config.service] started
- start service service[nfs-config.service]
* service[nfs-config.service] action enable
[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-config.service] action enable (nfs::_common line 46)
(up to date)
Recipe: nfs::server
* apt_package[nfs-kernel-server] action install
[2022-06-30T15:09:32+00:00] INFO: Processing apt_package[nfs-kernel-server] action install (nfs::server line 23)
(up to date)
* template[/etc/default/nfs-kernel-server] action create
[2022-06-30T15:09:32+00:00] INFO: Processing template[/etc/default/nfs-kernel-server] action create (nfs::server line 30)
[2022-06-30T15:09:32+00:00] INFO: template[/etc/default/nfs-kernel-server] backed up to /etc/chef/local-mode-cache/backup/etc/default/nfs-kernel-server.chef-20220630150932.773271
[2022-06-30T15:09:32+00:00] INFO: template[/etc/default/nfs-kernel-server] updated file contents /etc/default/nfs-kernel-server
- update content in file /etc/default/nfs-kernel-server from 5ba45c to 1c890d
--- /etc/default/nfs-kernel-server 2022-05-12 10:22:32.092102192 +0000
+++ /etc/default/.chef-nfs-kernel-server20220630-1526-ptjcf8 2022-06-30 15:09:32.767358655 +0000
@@ -1,4 +1,4 @@
-# Generated by Chef for ip-172-31-0-151.ec2.internal# Local modifications will be overwritten.
+# Generated by Chef for ip-10-0-0-21.us-east-2.compute.internal# Local modifications will be overwritten.
 # Rendered Debian/Ubuntu template variant
 RPCMOUNTDOPTS="-p 32767"
 RPCNFSDCOUNT="8"
* service[nfs-kernel-server] action start
[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-kernel-server] action start (nfs::server line 42)
(up to date)
* service[nfs-kernel-server] action enable
[2022-06-30T15:09:32+00:00] INFO: Processing service[nfs-kernel-server] action enable (nfs::server line 42)
(up to date)
Recipe: nfs::_idmap
* template[/etc/idmapd.conf] action create
[2022-06-30T15:09:32+00:00] INFO: Processing template[/etc/idmapd.conf] action create (nfs::_idmap line 23)
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] backed up to /etc/chef/local-mode-cache/backup/etc/idmapd.conf.chef-20220630150932.915870
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] updated file contents /etc/idmapd.conf
- update content in file /etc/idmapd.conf from b0488e to b10c33
--- /etc/idmapd.conf 2021-05-12 19:30:06.000000000 +0000
+++ /etc/.chef-idmapd20220630-1526-9cg9sn.conf 2022-06-30 15:09:32.907362122 +0000
@@ -2,11 +2,35 @@
 Verbosity = 0
 Pipefs-Directory = /run/rpc_pipefs
-# set your own domain here, if it differs from FQDN minus hostname
-# Domain = localdomain
+# The following should be set to the local NFSv4 domain name
+# The default is the host's DNS domain name.
+Domain = us-east-2.compute.internal
+
+# The following is a comma-separated list of Kerberos realm
+# names that should be considered to be equivalent to the
+# local realm, such that <user>@REALM.A can be assumed to
+# be the same user as <user>@REALM.B
+# If not specified, the default local realm is the domain name,
+# which defaults to the host's DNS domain name,
+# translated to upper-case.
+# Note that if this value is specified, the local realm name
+# must be included in the list!
+#Local-Realms =
+
 [Mapping]
 Nobody-User = nobody
 Nobody-Group = nogroup
+
+[Translation]
+
+# Translation Method is an comma-separated, ordered list of
+# translation methods that can be used. Distributed methods
+# include "nsswitch", "umich_ldap", and "static". Each method
+# is a dynamically loadable plugin library.
+# New methods may be defined and inserted in the list.
+# The default is "nsswitch".
+Method = nsswitch
+
[2022-06-30T15:09:32+00:00] INFO: template[/etc/idmapd.conf] sending restart action to service[idmap] (immediate)
* service[idmap] action restart
[2022-06-30T15:09:32+00:00] INFO: Processing service[idmap] action restart (nfs::_idmap line 29)
[2022-06-30T15:09:33+00:00] INFO: service[idmap] restarted
- restart service service[idmap]
* service[idmap] action start
[2022-06-30T15:09:33+00:00] INFO: Processing service[idmap] action start (nfs::_idmap line 29)
(up to date)
* service[idmap] action enable
[2022-06-30T15:09:33+00:00] INFO: Processing service[idmap] action enable (nfs::_idmap line 29)
(up to date)
Recipe: aws-parallelcluster-config::nfs
* service[nfs-kernel-server] action restart
[2022-06-30T15:09:33+00:00] INFO: Processing service[nfs-kernel-server] action restart (aws-parallelcluster-config::nfs line 39)
[2022-06-30T15:09:34+00:00] INFO: service[nfs-kernel-server] restarted
- restart service service[nfs-kernel-server]
Recipe: aws-parallelcluster-config::base
* service[setup-ephemeral] action enable
[2022-06-30T15:09:34+00:00] INFO: Processing service[setup-ephemeral] action enable (aws-parallelcluster-config::base line 30)
[2022-06-30T15:09:35+00:00] INFO: service[setup-ephemeral] enabled
- enable service service[setup-ephemeral]
* execute[Setup of ephemeral drivers] action run
[2022-06-30T15:09:35+00:00] INFO: Processing execute[Setup of ephemeral drivers] action run (aws-parallelcluster-config::base line 37)
[execute] ParallelCluster - [INFO] This instance type doesn't have instance store
[2022-06-30T15:09:35+00:00] INFO: execute[Setup of ephemeral drivers] ran successfully
- execute /usr/local/sbin/setup-ephemeral-drives.sh
* sysctl[net.core.somaxconn] action apply
[2022-06-30T15:09:35+00:00] INFO: Processing sysctl[net.core.somaxconn] action apply (aws-parallelcluster-config::base line 44)
* directory[/etc/sysctl.d] action create
[2022-06-30T15:09:36+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] action create
[2022-06-30T15:09:36+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] created file /etc/sysctl.d/99-chef-net.core.somaxconn.conf
- create new file /etc/sysctl.d/99-chef-net.core.somaxconn.conf
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.core.somaxconn.conf] updated file contents /etc/sysctl.d/99-chef-net.core.somaxconn.conf
- update content in file /etc/sysctl.d/99-chef-net.core.somaxconn.conf from none to 364f9b
--- /etc/sysctl.d/99-chef-net.core.somaxconn.conf 2022-06-30 15:09:36.059440150 +0000
+++ /etc/sysctl.d/.chef-99-chef-net20220630-1526-wi832c.core.somaxconn.conf 2022-06-30 15:09:36.059440150 +0000
@@ -1 +1,2 @@
+net.core.somaxconn = 65535
* execute[Load sysctl values] action run
[2022-06-30T15:09:36+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:36+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create net.core.somaxconn
- set value to "65535"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
* sysctl[net.ipv4.tcp_max_syn_backlog] action apply
[2022-06-30T15:09:36+00:00] INFO: Processing sysctl[net.ipv4.tcp_max_syn_backlog] action apply (aws-parallelcluster-config::base line 48)
* directory[/etc/sysctl.d] action create
[2022-06-30T15:09:36+00:00] INFO: Processing directory[/etc/sysctl.d] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 139)
(up to date)
* file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] action create
[2022-06-30T15:09:36+00:00] INFO: Processing file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] action create (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 141)
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] created file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf
- create new file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf
[2022-06-30T15:09:36+00:00] INFO: file[/etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf] updated file contents /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf
- update content in file /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf from none to 97069c
--- /etc/sysctl.d/99-chef-net.ipv4.tcp_max_syn_backlog.conf 2022-06-30 15:09:36.215444012 +0000
+++ /etc/sysctl.d/.chef-99-chef-net20220630-1526-90ip3s.ipv4.tcp_max_syn_backlog.conf 2022-06-30 15:09:36.215444012 +0000
@@ -1 +1,2 @@
+net.ipv4.tcp_max_syn_backlog = 65535
* execute[Load sysctl values] action run
[2022-06-30T15:09:36+00:00] INFO: Processing execute[Load sysctl values] action run (/opt/cinc/embedded/lib/ruby/gems/3.0.0/gems/chef-17.2.29/lib/chef/resource/sysctl.rb line 145)
[2022-06-30T15:09:36+00:00] INFO: execute[Load sysctl values] ran successfully
- execute sysctl -p
- create net.ipv4.tcp_max_syn_backlog
- set value to "65535"
- set comment to [] (default value)
- set conf_dir to "/etc/sysctl.d" (default value)
Recipe: aws-parallelcluster-config::chrony
* service[chrony] action enable
[2022-06-30T15:09:36+00:00] INFO: Processing service[chrony] action enable (aws-parallelcluster-config::chrony line 18)
(up to date)
* service[chrony] action start
[2022-06-30T15:09:36+00:00] INFO: Processing service[chrony] action start (aws-parallelcluster-config::chrony line 18)
(up to date)
Recipe: aws-parallelcluster-config::head_node_base
* execute[attach_volume_0] action run
[2022-06-30T15:09:36+00:00] INFO: Processing execute[attach_volume_0] action run (aws-parallelcluster-config::head_node_base line 42)
[execute] Traceback (most recent call last):
  File "/usr/local/sbin/attachVolume.py", line 152, in <module>
    main()
  File "/usr/local/sbin/attachVolume.py", line 130, in main
    response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev)
  File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance
================================================================================
Error executing action `run` on resource 'execute[attach_volume_0]'
================================================================================
Mixlib::ShellOut::ShellCommandFailed
------------------------------------
Expected process to exit with [0], but received '1'
---- Begin output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
STDOUT:
STDERR: Traceback (most recent call last):
  File "/usr/local/sbin/attachVolume.py", line 152, in <module>
    main()
  File "/usr/local/sbin/attachVolume.py", line 130, in main
    response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev)
  File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance
---- End output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ----
Ran /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 returned 1
Resource Declaration:
---------------------
# In /etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/recipes/head_node_base.rb
42: execute "attach_volume_#{index}" do
43:   command "#{node.default['cluster']['cookbook_virtualenv_path']}/bin/python /usr/local/sbin/attachVolume.py #{volumeid}"
44:   creates dev_path[index]
45: end
46:
Compiled Resource:
------------------
# Declared in /etc/chef/local-mode-cache/cache/cookbooks/aws-parallelcluster-config/recipes/head_node_base.rb:42:in `block in from_file'
execute("attach_volume_0") do
  action [:run]
  default_guard_interpreter :execute
  command "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4"
  declared_type :execute
  cookbook_name "aws-parallelcluster-config"
  recipe_name "head_node_base"
  user nil
  domain nil
  creates "/dev/disk/by-ebs-volumeid/vol-0b087667ecac188e4"
end
System Info:
------------
chef_version=17.2.29
platform=ubuntu
platform_version=20.04
ruby=ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
program_name=/bin/cinc-client
executable=/opt/cinc/bin/cinc-client
[2022-06-30T15:09:37+00:00] INFO: Running queued delayed notifications before re-raising exception | |
[2022-06-30T15:09:37+00:00] INFO: template[/etc/default/nfs-kernel-server] sending restart action to service[nfs-kernel-server] (delayed) | |
Recipe: aws-parallelcluster-config::nfs | |
* service[nfs-kernel-server] action restart[2022-06-30T15:09:37+00:00] INFO: Processing service[nfs-kernel-server] action restart (aws-parallelcluster-config::nfs line 39) | |
[2022-06-30T15:09:39+00:00] INFO: service[nfs-kernel-server] restarted | |
- restart service service[nfs-kernel-server] | |
Running handlers: | |
[2022-06-30T15:09:39+00:00] ERROR: Running exception handlers | |
Running handlers complete | |
[2022-06-30T15:09:39+00:00] ERROR: Exception handlers complete | |
Cinc Client failed. 27 resources updated in 21 seconds | |
[2022-06-30T15:09:39+00:00] FATAL: Stacktrace dumped to /etc/chef/local-mode-cache/cache/cinc-stacktrace.out | |
[2022-06-30T15:09:39+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report | |
[2022-06-30T15:09:39+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: execute[attach_volume_0] (aws-parallelcluster-config::head_node_base line 42) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1' | |
---- Begin output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ---- | |
STDOUT: | |
STDERR: Traceback (most recent call last): | |
File "/usr/local/sbin/attachVolume.py", line 152, in <module> | |
main() | |
File "/usr/local/sbin/attachVolume.py", line 130, in main | |
response = ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=dev) | |
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 508, in _api_call | |
return self._make_api_call(operation_name, kwargs) | |
File "/opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/lib/python3.7/site-packages/botocore/client.py", line 911, in _make_api_call | |
raise error_class(parsed_response, operation_name) | |
botocore.exceptions.ClientError: An error occurred (VolumeInUse) when calling the AttachVolume operation: vol-0b087667ecac188e4 is already attached to an instance | |
---- End output of /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 ---- | |
Ran /opt/parallelcluster/pyenv/versions/3.7.10/envs/cookbook_virtualenv/bin/python /usr/local/sbin/attachVolume.py vol-0b087667ecac188e4 returned 1 | |
ubuntu@ip-10-0-0-21:/var/log$ | |
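The root cause above is that `attachVolume.py` calls `ec2.attach_volume` on a volume whose state is already `in-use` on another instance, so EC2 rejects the call with `VolumeInUse`; the Chef `creates "/dev/disk/by-ebs-volumeid/..."` guard does not help because that device symlink only exists on the instance currently holding the attachment. A minimal diagnostic sketch of the safety check (`is_attach_safe` is a hypothetical helper, not part of ParallelCluster; the response dict mirrors the shape of EC2's `DescribeVolumes` output, which the botocore client in the traceback wraps):

```python
# Hypothetical helper: decide whether an AttachVolume call can succeed,
# given a DescribeVolumes-shaped response for that volume.
def is_attach_safe(describe_volumes_response, instance_id):
    """Return (safe, reason). Attaching is only safe when the volume is
    'available', or already attached to the *same* instance (idempotent case)."""
    vol = describe_volumes_response["Volumes"][0]
    state = vol["State"]
    if state == "available":
        return True, "volume is detached and available"
    if state == "in-use":
        attached_to = [a["InstanceId"] for a in vol.get("Attachments", [])]
        if instance_id in attached_to:
            return True, "already attached to this instance"
        # This branch corresponds to the VolumeInUse error in the log above.
        return False, "in use by {}".format(attached_to)
    return False, "unexpected state: {}".format(state)

# Example response shaped like EC2 DescribeVolumes output (IDs are made up):
resp = {"Volumes": [{"State": "in-use",
                     "Attachments": [{"InstanceId": "i-0123456789abcdef0"}]}]}
print(is_attach_safe(resp, "i-0fedcba9876543210"))
```

Under this reading, detaching vol-0b087667ecac188e4 from the instance that holds it (or waiting for a stuck detach to complete) and re-running cinc-client should clear the failure.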