@rhb2
Last active January 5, 2021 23:58
Scavenger Hunt

From the headnode, log in to the rebalancer zone

[rbogart@headnode (ap-northeast-1a) ~]$ manta-login rebalancer
0:   rebalancer        1 90ed0873-1c3a-4a25-84fe-53cb0a5e5443 10.92.72.42     
1:   rebalancer        1 d76fdeca-75ed-4451-8046-eda32382789c 10.92.64.42     
Choose a number: 0
[Connected to zone '90ed0873-1c3a-4a25-84fe-53cb0a5e5443' pts/8]
Last login: Tue Jan  5 19:52:23 on pts/6
 =  J O Y E N T  =

    mantav2-rebalancer (master-20201013T203847Z-g7702692)
    https://github.com/joyent/manta-rebalancer.git
    triton-origin-x86_64-19.4.0@master-20200130T200825Z-gbb45b8d

Open the rebalancer-manager postgres database

[root@90ed0873 (rebalancer) ~]$ psql -U postgres rebalancer
psql (12.4)
Type "help" for help.

rebalancer=# \dt
        List of relations
 Schema | Name | Type  |  Owner   
--------+------+-------+----------
 public | jobs | table | postgres
(1 row)

Look up the evacuation job that you care about to verify that it exists

rebalancer=# select * from jobs where id = '6395ae3a-1be2-48ef-8b33-f193c3446cda';
                  id                  |  action  |  state   
--------------------------------------+----------+----------
 6395ae3a-1be2-48ef-8b33-f193c3446cda | evacuate | complete
(1 row)

Exit psql and open the database named after the evacuation job id

[root@90ed0873 (rebalancer) ~]$ psql -U postgres 6395ae3a-1be2-48ef-8b33-f193c3446cda
psql (12.4)
Type "help" for help.

6395ae3a-1be2-48ef-8b33-f193c3446cda=# 

View tables in the database for this job

6395ae3a-1be2-48ef-8b33-f193c3446cda=# \d
              List of relations
 Schema |      Name       | Type  |  Owner   
--------+-----------------+-------+----------
 public | config          | table | postgres
 public | duplicates      | table | postgres
 public | evacuateobjects | table | postgres
(3 rows)

Describe the evacuateobjects table

6395ae3a-1be2-48ef-8b33-f193c3446cda=# \d evacuateobjects
              Table "public.evacuateobjects"
     Column     |  Type   | Collation | Nullable | Default 
----------------+---------+-----------+----------+---------
 id             | text    |           | not null | 
 assignment_id  | text    |           |          | 
 object         | jsonb   |           |          | 
 shard          | integer |           |          | 
 dest_shark     | text    |           |          | 
 etag           | text    |           |          | 
 status         | text    |           | not null | 
 skipped_reason | text    |           |          | 
 error          | text    |           |          | 
Indexes:
    "evacuateobjects_pkey" PRIMARY KEY, btree (id)
    "assignment_id" btree (assignment_id)

Output truncated here to avoid confusion.

For example, look for all objects that have a skipped_reason of 'source_other_error'

6395ae3a-1be2-48ef-8b33-f193c3446cda=# select * from evacuateobjects where skipped_reason = 'source_other_error';

Output below truncated to avoid confusion:

|     8 | 911.stor.ap-northeast.scloud.host | D543764D | skipped | source_other_error | 
 0cb7a2ba-5567-6df9-d8f5-b989bebc6bf4 | 99439118-33ac-46fc-b6bd-7b79206347bf | {"key": "/930896af-bf8c-48d4-885c-6573a94b1853/stor/logs/ap-northeast-1a/vminfod/2019/11/22/07/MS713754.log", "etag": "0cb7a2ba-5567-6df9-d8f5-b989bebc6bf4", "name": "MS713754.log", "type": "object", "mtime": 1574410246020, "owner": "930896af-bf8c-48d4-885c-6573a94b1853", "roles": [], "vnode": 54881, "sharks": [{"datacenter": "ap-northeast-1b", "manta_storage_id": "1016.stor.ap-northeast.scloud.host"}, {"datacenter": "ap-northeast-1c", "manta_storage_id": "331.stor.ap-northeast.scloud.host"}], "creator": "930896af-bf8c-48d4-885c-6573a94b1853", "dirname": "/930896af-bf8c-48d4-885c-6573a94b1853/stor/logs/ap-northeast-1a/vminfod/2019/11/22/07", "headers": {}, "objectId": "0cb7a2ba-5567-6df9-d8f5-b989bebc6bf4", "contentMD5": "eVupmjr+GoBeAXs6wv7bNg==", "contentType": "text/plain", "contentLength": 208232}                   

From this row, we can see a few things that we need:

  1. The name of the system running the agent (the dest_shark column): 911.stor.ap-northeast.scloud.host
  2. The assignment id: 99439118-33ac-46fc-b6bd-7b79206347bf
  3. The name of the system being evacuated: 1016.stor.ap-northeast.scloud.host
  4. The name of the system that the rebalancer agent is attempting to download a copy of the object from: 331.stor.ap-northeast.scloud.host
  5. The id of the object (objectId): 0cb7a2ba-5567-6df9-d8f5-b989bebc6bf4
  6. The id of the account (owner): 930896af-bf8c-48d4-885c-6573a94b1853
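
Rather than reading these values off a very wide row, the same fields can be selected directly, using PostgreSQL's jsonb operators on the object column. The following is only a sketch: the helper name skipped_fields_query and the column aliases are hypothetical, and the table and column names are taken from the \d output earlier.

```shell
# Emit a query returning only the fields of interest for rows with the
# given skipped_reason. The reason is restricted to [a-z_] so that it can
# be embedded in the SQL text without quoting problems.
skipped_fields_query() {
    case "$1" in
        ''|*[!a-z_]*) echo "invalid skipped_reason: $1" >&2; return 1 ;;
    esac
    printf "select id, assignment_id, dest_shark, object->>'owner' as account, object->'sharks' as sharks from evacuateobjects where skipped_reason = '%s';\n" "$1"
}

# Usage in the rebalancer zone (not run here); -x prints one column per line:
#   psql -U postgres -x -d 6395ae3a-1be2-48ef-8b33-f193c3446cda \
#       -c "$(skipped_fields_query source_other_error)"
```

The sharks array still lists both copies, so deciding which one is being evacuated and which one is the download source remains a manual step.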

Log in to the system running the agent of interest. From the headnode (make sure you are in the right DC), find its zone:

[rbogart@headnode (ap-northeast-1a) ~]$ manta-adm show -o storage_id,zonename storage |grep 911
911.stor.ap-northeast.scloud.host 6ad9ba31-4abf-42c7-9cb5-851d91774ef5
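
Note that grep 911 matches any line containing those digits, which could include a storage id such as 1911 or a zone UUID. An exact match on the first column may be safer. A minimal sketch, assuming the two-column output shown above; the function name zone_for_storage_id is hypothetical:

```shell
# Print the zonename whose storage_id (column 1) exactly equals $1.
# Reads `manta-adm show -o storage_id,zonename storage` output on stdin.
zone_for_storage_id() {
    awk -v id="$1" '$1 == id { print $2 }'
}

# Usage on the headnode (not run here):
#   manta-adm show -o storage_id,zonename storage |
#       zone_for_storage_id 911.stor.ap-northeast.scloud.host
```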

Now log in to the zone running the agent

[rbogart@headnode (ap-northeast-1a) ~]$ manta-login 6ad9ba31-4abf-42c7-9cb5-851d91774ef5
[Connected to zone '6ad9ba31-4abf-42c7-9cb5-851d91774ef5' pts/2]
Last login: Tue Jan  5 21:08:07 on pts/2
 =  J O Y E N T  =

    mantav2-storage (master-20200430T161240Z-g555e7ac)
    https://github.com/joyent/manta-mako.git
    triton-origin-x86_64-19.4.0@master-20200130T200825Z-gbb45b8d

[root@6ad9ba31 (storage) ~]$ 

If you would like to look at the assignment that the errant object was a part of:

[root@6ad9ba31 (storage) ~]$ cd /var/tmp/rebalancer/completed
[root@6ad9ba31 (storage) /var/tmp/rebalancer/completed]$ sqlite3 99439118-33ac-46fc-b6bd-7b79206347bf
sqlite> .tables
stats  tasks
sqlite> select * from tasks;
a018bf2f-c1b5-4c03-932b-ddecc63dfb00|e9c7b8f7-f51a-43ed-a159-f109da7d3162|XE6n8nxthNs3zB7PlO8w+w==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
2e2de31f-92bc-6e27-cbeb-84b4942d5820|e9c7b8f7-f51a-43ed-a159-f109da7d3162|ZMSeig4gNyf8fVkeFqfJBQ==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
0a35237c-9d2e-cf02-ed17-fb1d5c5934ab|e9c7b8f7-f51a-43ed-a159-f109da7d3162|kuQa90Krw3DIDZwktzwLrw==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
fce45ed9-b0e6-e4fc-abe8-eb1c47e799a1|e9c7b8f7-f51a-43ed-a159-f109da7d3162|G/ErcWDWKNgFtdDoIS9gkA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
602ebdc3-1b9c-e92e-eb29-e519a8081bff|e9c7b8f7-f51a-43ed-a159-f109da7d3162|nMTzbaIIHXw7Mr74PPiitw==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
1084b552-6283-e2ad-c326-945c8a060396|e9c7b8f7-f51a-43ed-a159-f109da7d3162|buxlNDmh6dp40yK52Udqtw==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
008422c7-2337-e463-f566-8fc450958bc8|e9c7b8f7-f51a-43ed-a159-f109da7d3162|XtKZI9w4x0QTtiJaXmyLWA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
2ea077d6-eab4-6219-9964-952f2f32287c|e9c7b8f7-f51a-43ed-a159-f109da7d3162|rSKje9PPVieQiSXZWBIQgA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
8fb0e95e-d958-cb4f-dc23-9a99ad079e90|e9c7b8f7-f51a-43ed-a159-f109da7d3162|vKj0JQxwCQyOx7Zf2VBXKw==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
8557904f-59d8-4ee1-b88b-d46d10bb936a|e9c7b8f7-f51a-43ed-a159-f109da7d3162|Ai38MuE45Qf+A5POv8Pypg==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
8c0db1d8-fc8e-686d-fd5a-d1778afffdb6|e9c7b8f7-f51a-43ed-a159-f109da7d3162|e0GN1A6mTObhqj6wDBp8IQ==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
1af96185-b7ed-ce6b-ebc4-de56d08f68c1|e9c7b8f7-f51a-43ed-a159-f109da7d3162|5wrFEc5eFiPEHomMVfyFXQ==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
53ab763a-082b-e0e0-a82c-e32956680cc5|e9c7b8f7-f51a-43ed-a159-f109da7d3162|ASIirgllNnQWzSl88uBlVg==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
d05a94a2-3d75-cc6a-e5f2-f2f30c46c505|14e7e2f4-6fc5-4eff-9590-5081afd47479|y4lxv7zw6Mo7qr9K20cZZA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
f7328875-d608-4b17-f68a-fe011184dd14|e9c7b8f7-f51a-43ed-a159-f109da7d3162|OVa01kCEIsly1oRuxv4eOg==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
b39cdbaa-0912-41d1-b9b1-93a7a5d8d276|8a70155f-752f-eb91-a755-f4edbee5c99f|qqt+DSOiI3JOErsjpWDryA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
ca7220bb-65cf-67aa-f9cf-88689895336e|e9c7b8f7-f51a-43ed-a159-f109da7d3162|7W/rBQsM20pjSa3Cnx9J4g==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
1e7c58dd-86d6-c82d-b238-87971ecca293|e9c7b8f7-f51a-43ed-a159-f109da7d3162|NVDg0jmvFp7o4RluPAjarA==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
ad9b3ef6-4c1c-4bab-904c-d5ed05713b19|e9c7b8f7-f51a-43ed-a159-f109da7d3162|n7nP/RSl+kwzFgD6Ey3NEg==|ap-northeast-1c|331.stor.ap-northeast.scloud.host|{"Failed":"SourceOtherError"}
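
For a larger assignment, a per-status tally can be easier to read than the full listing. The sketch below counts rows by their last '|'-separated field, which is the status in the listing above. It assumes sqlite3's default list output (fields separated by '|') and that the status text itself contains no '|'. The function name count_by_last_field is hypothetical.

```shell
# Tally input rows by their last '|'-separated field and print
# "<count> <field>" for each distinct value. Blank lines are ignored.
count_by_last_field() {
    awk -F'|' 'NF { n[$NF]++ } END { for (k in n) print n[k], k }'
}

# Usage in the storage zone (not run here):
#   sqlite3 /var/tmp/rebalancer/completed/99439118-33ac-46fc-b6bd-7b79206347bf \
#       'select * from tasks;' | count_by_last_field
```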

If you would like to reproduce a problem with an object by hand, one way to do it is to request the object directly from the system that the agent was downloading from:

[root@6ad9ba31 (storage) /var/tmp/rebalancer/completed]$ export TARGET=331.stor.ap-northeast.scloud.host
[root@6ad9ba31 (storage) /var/tmp/rebalancer/completed]$ export ACCOUNT=930896af-bf8c-48d4-885c-6573a94b1853
[root@6ad9ba31 (storage) /var/tmp/rebalancer/completed]$ export OBJECT=0cb7a2ba-5567-6df9-d8f5-b989bebc6bf4
[root@6ad9ba31 (storage) /var/tmp/rebalancer/completed]$ curl http://${TARGET}/${ACCOUNT}/${OBJECT}
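
When only the outcome of the request matters, it can help to discard the body and print just the HTTP status code. This is a sketch under the same URL layout as the curl command above. The function names mako_object_url and probe_object are hypothetical. The curl options used (-s, -S, -o, -w with %{http_code}) are standard.

```shell
# Build the URL used above: http://<storage id>/<account>/<object>
mako_object_url() {
    printf 'http://%s/%s/%s\n' "$1" "$2" "$3"
}

# Download the object, discard the body, and print the HTTP status code.
# -s hides the progress meter, -S still reports errors, -o discards the
# body, -w prints the status code once the transfer finishes.
probe_object() {
    curl -sS -o /dev/null -w '%{http_code}\n' "$(mako_object_url "$1" "$2" "$3")"
}

# Usage in the storage zone (not run here):
#   probe_object "$TARGET" "$ACCOUNT" "$OBJECT"
```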