@bflad
Created June 26, 2013 01:22
Skyline Analyzer DictProxy/KeyError
redis-cli
> flushall
> exit
service skyline-analyzer start
tail -f /var/log/skyline/analyzer.log
started with pid 24099
2013-06-25 20:20:25 :: starting skyline analyzer
2013-06-25 20:20:25 :: seconds to run :: 0.10
2013-06-25 20:20:25 :: total metrics :: 3793
2013-06-25 20:20:25 :: total analyzed :: -2
2013-06-25 20:20:25 :: total anomalies :: 0
2013-06-25 20:20:25 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:20:25 :: anomaly breakdown :: {}
2013-06-25 20:20:26 :: sleeping due to low run time...
2013-06-25 20:20:36 :: seconds to run :: 0.10
2013-06-25 20:20:36 :: total metrics :: 3793
2013-06-25 20:20:36 :: total analyzed :: -2
2013-06-25 20:20:36 :: total anomalies :: 0
2013-06-25 20:20:36 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:20:36 :: anomaly breakdown :: {}
2013-06-25 20:20:36 :: sleeping due to low run time...
2013-06-25 20:20:46 :: seconds to run :: 0.15
2013-06-25 20:20:46 :: total metrics :: 3794
2013-06-25 20:20:46 :: total analyzed :: -1
2013-06-25 20:20:46 :: total anomalies :: 0
2013-06-25 20:20:46 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:20:46 :: anomaly breakdown :: {}
2013-06-25 20:20:46 :: sleeping due to low run time...
2013-06-25 20:20:56 :: seconds to run :: 0.11
2013-06-25 20:20:56 :: total metrics :: 3794
2013-06-25 20:20:56 :: total analyzed :: -1
2013-06-25 20:20:56 :: total anomalies :: 0
2013-06-25 20:20:56 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:20:56 :: anomaly breakdown :: <DictProxy object, typeid 'dict' at 0x2aec3d0; '__str__()' failed>
2013-06-25 20:20:56 :: sleeping due to low run time...
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '92f330'
---------------------------------------------------------------------------
Process Process-35:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/opt/skyline/src/analyzer/analyzer.py", line 118, in spin_process
if key not in self.exceptions:
File "<string>", line 2, in __contains__
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '92f330'
---------------------------------------------------------------------------
Process Process-34:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/opt/skyline/src/analyzer/analyzer.py", line 118, in spin_process
if key not in self.exceptions:
File "<string>", line 2, in __contains__
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '92f330'
---------------------------------------------------------------------------
Process Process-36:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/opt/skyline/src/analyzer/analyzer.py", line 118, in spin_process
if key not in self.exceptions:
File "<string>", line 2, in __contains__
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '92f330'
---------------------------------------------------------------------------
2013-06-25 20:21:06 :: seconds to run :: 0.11
2013-06-25 20:21:06 :: total metrics :: 3794
ading.py", line 532, in __bootstrap_inner
self.run()
File "/opt/skyline/src/analyzer/analyzer.py", line 173, in run
logger.info('total analyzed :: %d' % (len(unique_metrics) - sum(self.exceptions.values())))
File "<string>", line 2, in values
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '92f330'
---------------------------------------------------------------------------
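The KeyError: '92f330' raised inside managers.py serve_client means the manager's server process no longer has any object registered under that id, so the DictProxy held by the workers (and, at analyzer.py line 173, by the run() thread itself) has gone stale; the <DictProxy object ... '__str__()' failed> line earlier in the log is the same staleness surfacing when the logger tries to render anomaly_breakdown. Judging from the frames at analyzer.py lines 118 and 173, each spin_process merges a local exception counter into a shared Manager dict, roughly like this standalone paraphrase (names and structure are my assumptions for illustration, not the project's code):

# Standalone paraphrase of the pattern implied by the tracebacks (per-process
# counters merged into a shared Manager dict); names are illustrative only.
from collections import defaultdict
from multiprocessing import Manager, Process

def spin_process(shared_exceptions):
    local = defaultdict(int)
    local['Incomplete'] += 1                    # stand-in for real analysis results
    # Each membership test / assignment below is a remote call into the
    # manager's server process; this is where analyzer.py line 118 raises
    # RemoteError(KeyError) once the proxy has gone stale.
    for key, value in local.items():
        if key not in shared_exceptions:
            shared_exceptions[key] = value
        else:
            shared_exceptions[key] += value

if __name__ == '__main__':
    exceptions = Manager().dict()
    workers = [Process(target=spin_process, args=(exceptions,)) for i in range(5)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print dict(exceptions)   # typically {'Incomplete': 5}; the merge is not atomic

Every membership test, read, and write on the proxy is a round trip to the manager's server process, so any path that lets the server drop the underlying dict while a proxy survives produces exactly the RemoteError/KeyError shown above.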
rm /var/run/skyline/analyzer.pid # due to separate bug
service skyline-analyzer start
tail -f /var/log/skyline/analyzer.log
started with pid 11436
2013-06-25 20:36:28 :: starting skyline analyzer
2013-06-25 20:36:28 :: seconds to run :: 0.14
2013-06-25 20:36:28 :: total metrics :: 3795
2013-06-25 20:36:28 :: total analyzed :: 0
2013-06-25 20:36:28 :: total anomalies :: 0
2013-06-25 20:36:28 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:36:28 :: anomaly breakdown :: {}
2013-06-25 20:36:29 :: sleeping due to low run time...
2013-06-25 20:36:39 :: seconds to run :: 0.10
2013-06-25 20:36:39 :: total metrics :: 3795
2013-06-25 20:36:39 :: total analyzed :: 0
2013-06-25 20:36:39 :: total anomalies :: 0
2013-06-25 20:36:39 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:36:39 :: anomaly breakdown :: {}
2013-06-25 20:36:39 :: sleeping due to low run time...
2013-06-25 20:36:49 :: seconds to run :: 0.10
2013-06-25 20:36:49 :: total metrics :: 3795
2013-06-25 20:36:49 :: total analyzed :: 0
2013-06-25 20:36:49 :: total anomalies :: 0
2013-06-25 20:36:49 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:36:49 :: anomaly breakdown :: {}
2013-06-25 20:36:49 :: sleeping due to low run time...
2013-06-25 20:36:59 :: seconds to run :: 0.10
2013-06-25 20:36:59 :: total metrics :: 3795
2013-06-25 20:36:59 :: total analyzed :: 0
2013-06-25 20:36:59 :: total anomalies :: 0
2013-06-25 20:36:59 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:36:59 :: anomaly breakdown :: <DictProxy object, typeid 'dict' at 0x2efa450; '__str__()' failed>
2013-06-25 20:36:59 :: sleeping due to low run time...
2013-06-25 20:37:10 :: seconds to run :: 0.10
2013-06-25 20:37:10 :: total metrics :: 3795
2013-06-25 20:37:10 :: total analyzed :: 0
2013-06-25 20:37:10 :: total anomalies :: 0
2013-06-25 20:37:10 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:37:10 :: anomaly breakdown :: <DictProxy object, typeid 'dict' at 0x2efa090; '__str__()' failed>
2013-06-25 20:37:10 :: sleeping due to low run time...
2013-06-25 20:37:20 :: seconds to run :: 0.13
2013-06-25 20:37:20 :: total metrics :: 3795
2013-06-25 20:37:20 :: total analyzed :: 0
2013-06-25 20:37:20 :: total anomalies :: 0
2013-06-25 20:37:20 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:37:20 :: anomaly breakdown :: {}
2013-06-25 20:37:20 :: sleeping due to low run time...
2013-06-25 20:37:30 :: seconds to run :: 0.11
2013-06-25 20:37:30 :: total metrics :: 3795
2013-06-25 20:37:30 :: total analyzed :: 0
2013-06-25 20:37:30 :: total anomalies :: 0
2013-06-25 20:37:30 :: exception stats :: {'Incomplete': 3795}
2013-06-25 20:37:30 :: anomaly breakdown :: <DictProxy object, typeid 'dict' at 0x2efab90; '__str__()' failed>
2013-06-25 20:37:30 :: sleeping due to low run time...
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/opt/skyline/src/analyzer/analyzer.py", line 118, in spin_process
if key not in self.exceptions:
File "<string>", line 2, in __contains__
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '2f30fe0'
---------------------------------------------------------------------------
File "/opt/skyline/src/analyzer/analyzer.py", line 118, in spin_process
if key not in self.exceptions:
File "<string>", line 2, in __contains__
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '2f30fe0'
---------------------------------------------------------------------------
2013-06-25 20:37:40 :: seconds to run :: 0.11
2013-06-25 20:37:40 :: total metrics :: 3795
ading.py", line 532, in __bootstrap_inner
self.run()
File "/opt/skyline/src/analyzer/analyzer.py", line 173, in run
logger.info('total analyzed :: %d' % (len(unique_metrics) - sum(self.exceptions.values())))
File "<string>", line 2, in values
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 216, in serve_client
obj, exposed, gettypeid = id_to_obj[ident]
KeyError: '2f30fe0'
---------------------------------------------------------------------------
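Both runs fail the same way once the proxies go stale. One pattern that produces exactly this serve_client KeyError is rebinding a shared counter to a fresh Manager().dict() each cycle while other processes or threads still hold proxies to the old dict. If that is what analyzer.py is doing (an assumption on my part; I have not confirmed it against the source), a possible mitigation is one long-lived Manager whose dicts are cleared in place between cycles:

# Hypothetical mitigation sketch, assuming the counters are currently being
# rebound to fresh Manager().dict() proxies each cycle; this is not an
# official Skyline fix.
from multiprocessing import Manager

manager = Manager()                  # one server process for the analyzer's lifetime
exceptions = manager.dict()
anomaly_breakdown = manager.dict()

def reset_counters():
    # DictProxy exposes clear(), so the underlying dict stays registered in
    # the manager and every proxy already handed out remains valid.
    exceptions.clear()
    anomaly_breakdown.clear()

An alternative that avoids manager proxies entirely is to have each spin_process push its local counts onto a multiprocessing.Queue and let the parent sum them after join().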
python --version
Python 2.6.6
pip freeze
Flask==0.10.1
Jinja2==2.7
M2Crypto==0.20.2
MarkupSafe==0.18
PyXML==0.8.4
Werkzeug==0.9.1
distribute==0.6.45
ethtool==0.6
hiredis==0.1.1
iniparse==0.3.1
ipython==0.13.2
itsdangerous==0.21
lockfile==0.9.1
matplotlib==1.2.1
msgpack-python==0.3.0
nose==1.3.0
numpy==1.7.1
pandas==0.11.0
patsy==0.1.0
pyOpenSSL==0.10
pycurl==7.19.0
pygpgme==0.1
python-daemon==1.6
python-dateutil==1.4.1
python-dmidecode==3.10.13
pytz==2013b
redis==2.7.6
rhsm==1.1.8
scipy==0.12.0
simplejson==2.0.9
statsmodels==0.5.0
sympy==0.7.2
urlgrabber==3.9.1
yum-metadata-parser==1.1.2
redis-cli info
# Server
redis_version:2.6.11
redis_git_sha1:00000000
redis_git_dirty:0
redis_mode:standalone
os:Linux 2.6.32-358.11.1.el6.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.4.7
process_id:22341
run_id:081121bb78fde7698c085c8a791c222cad92a917
tcp_port:6379
uptime_in_seconds:426781
uptime_in_days:4
hz:10
lru_clock:906081
# Clients
connected_clients:6
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
# Memory
used_memory:12441160
used_memory_human:11.86M
used_memory_rss:21323776
used_memory_peak:207273224
used_memory_peak_human:197.67M
used_memory_lua:31744
mem_fragmentation_ratio:1.71
mem_allocator:jemalloc-3.2.0
# Persistence
loading:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1372209608
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
# Stats
total_connections_received:55650
total_commands_processed:1117456432
instantaneous_ops_per_sec:1
rejected_connections:0
expired_keys:0
evicted_keys:0
keyspace_hits:204581940
keyspace_misses:2461
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:1569
# Replication
role:master
connected_slaves:0
# CPU
used_cpu_sys:42442.49
used_cpu_user:12624.47
used_cpu_sys_children:327.57
used_cpu_user_children:1947.51
# Keyspace
db0:keys=7612,expires=0
"""
This file is dynamically written by Chef on --OMITTED--.
Any changes will be automatically overwritten.
"""
"""
Shared settings
"""
# The path for the Redis unix socket
REDIS_SOCKET_PATH='/tmp/redis.sock'
# The Skyline logs directory. Do not include a trailing slash.
LOG_PATH = '/var/log/skyline'
# The Skyline pids directory. Do not include a trailing slash.
PID_PATH = '/var/run/skyline'
# Metrics will be prefixed with this value in Redis.
FULL_NAMESPACE = 'metrics.'
# The Horizon agent will make T'd writes to both the full namespace and the
# mini namespace. Oculus gets its data from everything in the mini namespace.
MINI_NAMESPACE = 'mini.'
# This is the rolling duration that will be stored in Redis. Be sure to pick a
# value that suits your memory capacity, your CPU capacity, and your overall
# metrics count. Longer durations take longer to analyze, but they can
# help the algorithms reduce the noise and provide more accurate anomaly
# detection.
FULL_DURATION = 86400
# This is the duration of the 'mini' namespace, if you are also using the
# Oculus service. It is also the duration of data that is displayed in the
# web app 'mini' view.
MINI_DURATION = 3600
# If you have a Graphite host set up, set this to get graphs on
# Skyline and Horizon. Include http://.
GRAPHITE_HOST = 'http://--OMITTED--'
# If you have Oculus set up, set this to enable the clickthrough
# on the webapp. Include http://.
OCULUS_HOST = 'http://--OMITTED--'
"""
Analyzer settings
"""
# This is the location where the Skyline agent will write the anomalies file.
# It needs to be in a location accessible to the webapp.
ANOMALY_DUMP = 'webapp/static/dump/anomalies.json'
# This is the number of processes that the Skyline analyzer will spawn.
# Analysis is a very CPU-intensive procedure. You will see optimal results
# if you set ANALYZER_PROCESSES to several fewer than the total number of
# CPUs on your box. Be sure to leave some CPU room for the Horizon workers,
# and for Redis.
ANALYZER_PROCESSES = 5
# This is the duration, in seconds, for a metric to become 'stale' and for
# the analyzer to ignore it until new datapoints are added. 'Staleness' means
# that a datapoint has not been added for STALE_PERIOD seconds.
STALE_PERIOD = 500
# This is the minimum length of a timeseries, in datapoints, for the analyzer
# to recognize it as a complete series.
MIN_TOLERABLE_LENGTH = 1
# Sometimes a metric will continually transmit the same number. There's no need
# to analyze metrics that remain boring like this, so this setting determines
# the number of boring datapoints that will be allowed to accumulate before the
# analyzer skips over the metric. If the metric becomes noisy again, the
# analyzer will stop ignoring it.
MAX_TOLERABLE_BOREDOM = 100
# The canary metric should be a metric with a very high, reliable resolution
# that you can use to gauge the status of the system as a whole.
CANARY_METRIC = 'statsd.numStats'
# These are the algorithms that the Analyzer will run. To add a new algorithm,
# you must both define the algorithm in algorithms.py and add its name here.
ALGORITHMS = ["first_hour_average", "mean_subtraction_cumulation", "simple_stddev_from_moving_average", "stddev_from_moving_average", "least_squares", "grubbs", "histogram_bins"]
# This is the number of algorithms that must return True before a metric is
# classified as anomalous.
CONSENSUS = 5
"""
Horizon settings
"""
# This is the number of worker processes that will consume from the Horizon
# queue.
WORKER_PROCESSES = 2
# This is the port that listens for Graphite pickles over TCP, sent by Graphite's
# carbon-relay agent.
PICKLE_PORT = 2024
# This is the port that listens for MessagePack-encoded UDP packets.
UDP_PORT = 2025
# This is how big a 'chunk' of metrics will be before they are added onto
# the shared queue for processing into Redis. If you are noticing that Horizon
# is having trouble consuming metrics, try setting this value higher.
CHUNK_SIZE = 10
# This is the maximum allowable length of the processing queue before new
# chunks are prevented from being added. If you consistently fill up the
# processing queue, a higher MAX_QUEUE_SIZE will not save you. It most likely
# means that the workers do not have enough CPU allotted to process the
# queue on time. Try increasing CHUNK_SIZE, decreasing ANALYZER_PROCESSES, or
# decreasing ROOMBA_PROCESSES.
MAX_QUEUE_SIZE = 500
# This is the number of Roomba processes that will be spawned to trim
# timeseries in order to keep them at FULL_DURATION. Keep this number small,
# as it is not important that metrics be exactly FULL_DURATION *all* the time.
ROOMBA_PROCESSES = 1
# The Horizon agent will ignore incoming datapoints if their timestamp
# is older than MAX_RESOLUTION seconds ago.
MAX_RESOLUTION = 1000
# These are metrics that, for whatever reason, you do not want to store
# in Skyline. The Listener will check to see if each incoming metric
# contains anything in the skip list. It is generally wise to skip entire
# namespaces by adding a '.' at the end of the skipped item - otherwise
# you might skip things you don't intend to.
SKIP_LIST = []
"""
Webapp settings
"""
# The IP address for the webapp
WEBAPP_IP = '0.0.0.0'
# The port for the webapp
WEBAPP_PORT = 1500
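One note on the non-error part of the log: exception stats :: {'Incomplete': 3795} right after a redis flushall is expected rather than alarming, because (as I understand Skyline's behaviour, so treat the exact form as an assumption) a series is only analyzed once it spans FULL_DURATION, which is 86400 seconds in the settings above. A rough, self-contained paraphrase of those pre-analysis gates:

# Rough, self-contained paraphrase of the pre-analysis checks that feed the
# "exception stats" line in the log; the real checks live in Skyline's
# algorithms module and may differ in detail.
from time import time

FULL_DURATION = 86400          # values taken from the settings above
STALE_PERIOD = 500
MIN_TOLERABLE_LENGTH = 1

class TooShort(Exception): pass
class Stale(Exception): pass
class Incomplete(Exception): pass

def check_series(timeseries):
    """timeseries is a list of (unix_timestamp, value) pairs."""
    if len(timeseries) < MIN_TOLERABLE_LENGTH:
        raise TooShort()
    if time() - timeseries[-1][0] > STALE_PERIOD:
        raise Stale()
    # Right after a redis flushall no metric can span a full day yet, which
    # is why every one of the 3795 metrics shows up under 'Incomplete'.
    if timeseries[-1][0] - timeseries[0][0] < FULL_DURATION:
        raise Incomplete()

So the Incomplete counts should drain away on their own as Redis refills over the next day; the stale DictProxy errors are the actual bug.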
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.4 (Santiago)
uname -a
Linux XXXX 2.6.32-358.11.1.el6.x86_64 #1 SMP Wed May 15 10:48:38 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux