@mffiedler
Last active August 30, 2017 19:00
Logging information
Clean up logging
oc project logging
oc delete dc --all --cascade=true
oc delete ds --all
oc delete route --all
oc delete svc --all
oc delete pvc --all
oc delete configmap --all
oc delete sa aggregated-logging-mux aggregated-logging-kibana aggregated-logging-fluentd aggregated-logging-elasticsearch aggregated-logging-curator
oc delete secret logging-curator logging-elasticsearch logging-fluentd logging-kibana logging-kibana-proxy logging-mux
# Set the use_mux variables to true to reproduce the logging-mux bug
# Increase the Elasticsearch cluster size by adding openshift_logging_es_cluster_size
[oo_first_master]
ec2-54-202-158-94.us-west-2.compute.amazonaws.com
[oo_first_master:vars]
openshift_deployment_type=openshift-enterprise
openshift_release=v3.6.0
openshift_logging_install_logging=true
openshift_logging_use_ops=false
openshift_logging_master_url=https://ec2-54-202-158-94.us-west-2.compute.amazonaws.com:8443
openshift_logging_master_public_url=https://ec2-54-202-158-94.us-west-2.compute.amazonaws.com:8443
openshift_logging_kibana_hostname=kibana.0712-6sq.qe.rhcloud.com
openshift_logging_kibana_ops_hostname=kibana-ops.0712-6sq.qe.rhcloud.com
openshift_logging_namespace=logging
openshift_logging_image_prefix=registry.ops.openshift.com/openshift3/
openshift_logging_image_version=v3.6.140
openshift_logging_es_pvc_dynamic=true
openshift_logging_es_pvc_size=50Gi
openshift_logging_es_cluster_size=3
openshift_logging_fluentd_use_journal=true
openshift_logging_use_mux=false
openshift_logging_use_mux_client=false
openshift_logging_es_ops_allow_cluster_reader=true
openshift_logging_es_ops_pvc_dynamic=true
openshift_logging_es_ops_pvc_size=50Gi
Fluentd 200 node issue
Create 200 node cluster with 3 infra nodes
Deploy logging. Use openshift_logging_es_cluster_size=3
Verify 1 fluentd running on each node
Check fluentd logs for errors
If needed, run a logging pod on each node (https://github.com/openshift/svt/blob/master/openshift_scalability/content/logtest/ocp_logtest-README.md)
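On a 200 node cluster the "1 fluentd on each node" check is tedious by eye. A minimal sketch of one way to script it follows. The helper name nodes_without_single_fluentd and the file names are hypothetical, and the component=fluentd pod label is an assumption about how the logging pods are labeled. The helper only counts pods per node; it does not check that the pods are Running.

```shell
# nodes_without_single_fluentd NODES_FILE POD_NODES_FILE
# NODES_FILE:     one node name per line (every node in the cluster)
# POD_NODES_FILE: one node name per line (the node of each fluentd pod)
# Prints "<node> <count>" for every node whose fluentd pod count is not 1.
nodes_without_single_fluentd() {
  awk '
    NR == FNR { if ($1 != "") count[$1] = 0; next }  # first file: register nodes
    $1 in count { count[$1]++ }                      # second file: count pods per node
    END { for (n in count) if (count[n] != 1) print n, count[n] }
  ' "$1" "$2" | sort
}

# Possible usage (assumes the "logging" project and the component=fluentd label):
#   oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' > nodes.txt
#   oc get pods -n logging -l component=fluentd \
#     -o jsonpath='{range .items[*]}{.spec.nodeName}{"\n"}{end}' > pod-nodes.txt
#   nodes_without_single_fluentd nodes.txt pod-nodes.txt
```

Empty output means every listed node has exactly one fluentd pod.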
For logging-mux issue:
30+ node cluster
Deploy logging with openshift_logging_use_mux=true and openshift_logging_use_mux_client=true; 1 Elasticsearch node should be fine
1 logging-mux replica on infra
On infra, run oc label node <node-name> --overwrite logging-infra-fluentd=false (we want the logging-mux to be the only fluentd running there)
Verify fluentd pods are connecting to the logging-mux pod
oc get svc and check ip address of logging-mux
oc exec <any fluentd pod> -- ss -tnpi
Look at the destination IPs; they should be the logging-mux svc IP
Go to the infra node and ps -ef | grep fluentd. There will be a parent and child. Get the pid of the child
pidstat -p <pid> 10
When the problem happens, fluentd will be at 100% CPU. It does not always trigger. If it does not:
try oc scale --replicas=0 dc/logging-mux and then back to --replicas=1 a couple of times
start some logging pods (see the logtest link above) and let them run a while
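The connection check from the steps above (compare the logging-mux service IP against the peers in the ss output) can be sketched as a small filter. The function name count_mux_connections is hypothetical, and the parsing assumes the usual ss column order (State, Recv-Q, Send-Q, Local, Peer).

```shell
# count_mux_connections MUX_IP
# Reads `ss -tn` / `ss -tnpi` output on stdin and prints the number of
# ESTAB connections whose peer address is MUX_IP.
count_mux_connections() {
  awk -v ip="$1" '
    $1 == "ESTAB" {
      peer = $5
      sub(/:[0-9]+$/, "", peer)       # drop the port
      sub(/^\[?::ffff:/, "", peer)    # drop an IPv4-mapped IPv6 prefix, if any
      sub(/\]$/, "", peer)            # drop a trailing bracket, if any
      if (peer == ip) n++
    }
    END { print n + 0 }
  '
}

# Possible usage (the pod name is a placeholder, as in the steps above):
#   MUX_IP=$(oc get svc logging-mux -n logging -o jsonpath='{.spec.clusterIP}')
#   oc exec <any fluentd pod> -n logging -- ss -tnpi | count_mux_connections "$MUX_IP"
```

A result of 0 from a fluentd pod suggests it is not forwarding to logging-mux.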
2nd way to attempt to trigger:
oc label nodes -l region=primary --overwrite logging-infra-fluentd=false
wait for pods to stop
oc label nodes -l region=primary --overwrite logging-infra-fluentd=true
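The label toggle above could be wrapped in a helper so it is easy to repeat. This is only a sketch: the function name toggle_fluentd_label is hypothetical, and the fixed sleep is a crude stand-in for "wait for pods to stop" (the default of 60 seconds is an assumption, not a measured value).

```shell
# toggle_fluentd_label SELECTOR [WAIT_SECONDS]
# Sets logging-infra-fluentd=false on the nodes matching SELECTOR, waits,
# then sets it back to true so the fluentd pods are recreated.
toggle_fluentd_label() {
  selector=$1
  wait_seconds=${2:-60}   # assumed default; tune to how long pods take to stop
  oc label nodes -l "$selector" --overwrite logging-infra-fluentd=false
  sleep "$wait_seconds"
  oc label nodes -l "$selector" --overwrite logging-infra-fluentd=true
}

# Possible usage:
#   toggle_fluentd_label region=primary 120
```

Checking oc get pods between the two label commands, rather than relying on the sleep, is the safer way to confirm the pods have actually stopped.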