[ RUN ] ContentType/AgentAPIStreamingTest.AttachInputToNestedContainerSession/1
I0515 19:41:23.545804 25362 cluster.cpp:160] Creating default 'local' authorizer
I0515 19:41:23.547175 25367 master.cpp:383] Master b6915b70-1be0-4916-903d-86326dbeda9d (ee944315b12c) started on 172.17.0.2:48299
I0515 19:41:23.547215 25367 master.cpp:385] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/0NvXy8/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="
@bmahler
bmahler / upgrades.md
Created March 30, 2017 01:36
Adding MULTI_ROLE to 1.3.x upgrade documentation
title: Apache Mesos - Upgrading Mesos
layout: documentation

Upgrading Mesos

This document is a guide for users who wish to upgrade an existing Mesos cluster. Some versions require particular upgrade techniques when upgrading a running cluster, and some upgrades introduce incompatible changes.

Overview

[5974305.082012] CPU: 7 PID: 22518 Comm: kworker/7:1 Tainted: P W IOE 4.2.2-coreos-r2 #2
[5974305.092045] Hardware name: Intel Corporation S2600TP/S2600TP, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[5974305.104102] Workqueue: cgroup_destroy css_free_work_fn
[5974305.110256] ffffffff817e530b ffff8828c4127c98 ffffffff81525d98 ffff883ffee30f38
[5974305.119161] ffff8828c4127ce8 ffff8828c4127cd8 ffffffff8106942a ffff882d4de8b340
[5974305.128059] ffffffff81b009c8 ffffffff81b009c8 ffff882d4de8b000 ffff882d4de88c00
[5974305.136946] Call Trace:
[5974305.140067] [<ffffffff81525d98>] dump_stack+0x45/0x57
[5974305.146198] [<ffffffff8106942a>] warn_slowpath_common+0x8a/0xc0
[5974305.153302] [<ffffffff810694a6>] warn_slowpath_fmt+0x46/0x50
[----------] 18 tests from LinuxFilesystemIsolatorTest (431470 ms total)
[----------] 24 tests from DockerContainerizerTest (343497 ms total)
[----------] 30 tests from SlaveRecoveryTest/0 (241867 ms total)
[----------] 3 tests from DockerRuntimeIsolatorTest (116041 ms total)
[----------] 8 tests from DockerTest (70494 ms total)
[----------] 4 tests from ProvisionerDockerPullerTest (64443 ms total)
[----------] 22 tests from DiskResource/PersistentVolumeTest (40507 ms total)
[----------] 4 tests from MesosContainerizerSlaveRecoveryTest (36853 ms total)
[----------] 46 tests from SlaveTest (36828 ms total)
[----------] 8 tests from HealthCheckTest (35899 ms total)
@bmahler
bmahler / ev.c
Created October 9, 2015 18:51
Sleeps injected into ev.c
/*
* libev event processing core, watcher management
*
* Copyright (c) 2007,2008,2009,2010,2011,2012,2013 Marc Alexander Lehmann <libev@schmorp.de>
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modifica-
* tion, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
@bmahler
bmahler / gist:d9c5ab9ab30124ffa8d9
Last active August 29, 2015 14:18
Contributor's Guide

Contributor's Guide

If you are making your first contributions, please review the instructions for making a contribution.

This document is an attempt to capture a shared set of values, practices, and learnings. Even though much of this may seem obvious, there is value in establishing a more formal reference: to arrive at an agreed-upon set of values, to help new contributors ramp up on the project, to foster discussion, etc.

Engineering Principles and Practices

Many companies rely on Mesos as a foundational layer of their software infrastructure, so it is imperative that we ship high-quality, robust code. We aim to foster a culture where we can trust and rely upon the work of the community.

@bmahler
bmahler / operational-guide.md
Last active August 29, 2015 14:13
Operational Guide

Operational Guide

Changing the master quorum

Currently the master uses a Paxos-based replicated log as its storage backend (--registry=replicated_log is the only storage backend supported). Each master participates in the ensemble as a log replica. The --quorum flag sets the quorum size, which must be a majority of the masters.

The following table shows the tolerance to master failures, for each quorum size:

Masters | Quorum Size | Failure Tolerance
1       | 1           | 0
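
The arithmetic behind the table can be sketched as follows. This is a minimal illustration, assuming the quorum is the smallest strict majority of the masters and that the failure tolerance is the number of masters that can be lost while a quorum remains. The function names `quorum_size` and `failure_tolerance` are hypothetical and are not part of Mesos.

```python
def quorum_size(masters: int) -> int:
    """Smallest strict majority of `masters` (assumed meaning of --quorum)."""
    if masters < 1:
        raise ValueError("at least one master is required")
    return masters // 2 + 1


def failure_tolerance(masters: int) -> int:
    """Number of masters that can fail while a quorum is still reachable."""
    return masters - quorum_size(masters)


if __name__ == "__main__":
    # Print one row per ensemble size, in the same column order as the table.
    for masters in (1, 3, 5):
        print(masters, quorum_size(masters), failure_tolerance(masters))
```

Under these assumptions an even number of masters tolerates no more failures than the next smaller odd number (2 masters tolerate 0 failures, the same as 1), which is one reason odd ensemble sizes are a common choice.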
I1030 19:23:53.686470 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.696506 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.706125 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.715679 1592 master.cpp:3349] Performing explicit task state reconciliation for 12 tasks of framework Singularity
I1030 19:23:53.726356 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.735402 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.744413 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Singularity
I1030 19:23:53.753397 1592 master.cpp:3349] Performing explicit task state reconciliation for 11 tasks of framework Sing
$ grep ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c scheduler_log
DEBUG [2014-10-30 19:01:27,762] com.hubspot.singularity.scheduler.SingularityNewTaskChecker: Got task state UNHEALTHY_KILL_TASK for task ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c in 00:00.004
DEBUG [2014-10-30 19:01:52,303] com.hubspot.singularity.scheduler.SingularityCleaner: Killing a task SingularityTaskCleanup [user=Optional.absent(), cleanupType=UNHEALTHY_NEW_TASK, timestamp=1414695687762, taskId=ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c] immediately because of its cleanup type
DEBUG [2014-10-30 19:01:52,304] com.hubspot.singularity.scheduler.SingularityCleaner: TaskCleanup SingularityTaskCleanup [user=Optional.absent(), cleanupType=UNHEALTHY_NEW_TASK, timestamp=1414695687762, taskId=ci-tagupdate-stryker.2014.10.21T22.57.44-1414434875917-1-10-us_west_2c] had LB state NOT_LOAD_BALANCED after 00:00.000
INFO [2014-10-30 19:01:52,304] com.hubspot.singularity.m
@bmahler
bmahler / yosemite
Created October 18, 2014 22:28
Yosemite
[ RUN ] VersionTest.Parse
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >'
what(): bad lexical cast: source type value could not be interpreted as target
*** Aborted at 1413669921 (unix time) try "date -d @1413669921" if you are using GNU date ***
PC: @ 0x7fff9570e282 __pthread_kill
*** SIGABRT (@0x7fff9570e282) received by PID 51428 (TID 0x7fff78786300) stack trace: ***
@ 0x7fff91106f1a _sigtramp
@ 0x100aef30f (anonymous namespace)::get_safe_base_mutex()::safe_base_mutex
@ 0x7fff9543ab73 abort
@ 0x100a203ab __gnu_cxx::__verbose_terminate_handler()