First impressions of tiup

tiup is a new version manager for TiDB.

Building

The README.md file doesn't include build instructions. tiup is written in Go. I don't know Go, but I type go build and hope it works. It does.

The website and installation

Let's follow the instructions at https://tiup.io/ and see what happens.

The TiDB link on the website links to the GitHub repo. Seems like it might want to link to the TiDB product website (which is the same as the PingCAP website).

$ curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6027k  100 6027k    0     0   873k      0  0:00:06  0:00:06 --:--:-- 1442k
Detected shell: /bin/bash
Shell profile:  /home/ubuntu/.bash_profile
/home/ubuntu/.bash_profile has been modified to to add tiup to PATH
open a new terminal or source /home/ubuntu/.bash_profile to use it
Installed path: /home/ubuntu/.tiup/bin/tiup
===============================================
Have a try:     tiup playground
===============================================

Running bash --login produces a working tiup environment.

Running tiup playground downloads playground, pd, tikv, and tidb and starts the cluster.

Error message with bad grammar:

"The component playground doesn't installed, download from repository"

From the output it looks like running playground itself triggers the installation of pd, tikv, and tidb. That is, tiup prints

Starting /home/ubuntu/.tiup/components/playground/v0.0.5/playground

and then begins downloading the remaining components. Makes me wonder what playground is, and whether playground itself is running tiup to install components.

What does tiup do to .bash_profile?

export PATH=/home/ubuntu/.tiup/bin:$PATH

The running cluster

tiup playground ends its output with:

CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --host 127.0.0.1 --port 4000 -u root
To view the dashboard: http://127.0.0.1:2379/dashboard

I can connect with mysql and I see an empty test database.

I want to view the dashboard but I'm running this on an EC2 instance, not locally. Can I change the IP address?

From the output of tiup playground --help it seems like no. I can only use 127.0.0.1.

I see the --monitor option and restart playground with it. Prometheus is downloaded and now the output looks like:

CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --host 127.0.0.1 --port 4000 -u root
To view the dashboard: http://127.0.0.1:2379/dashboard
To view the monitor: http://127.0.0.1:9090

What's in the .tiup directory?

The data directory is interesting. It contains two directories with semi-random names. I wonder if I get a new one every time I run playground. Seems like it, and that I need to run tiup clean --all to delete them.

Would be more foolproof for tiup to deal with this automatically.

tiup commands

The tiup CLI treats components as subcommands, which is confusing. That is, in tiup playground, playground is not a command; it is the name of a component.

What if there need to be non-executable components someday? Some components I try to run exit with errors.

This part of the tiup help output is awkward:

  # *HOW TO* reuse instance data instead of generating a new data directory each time?
  # The instances which have the same "TAG" will share the data directory: $TIUP_HOME/data/$TAG.
  $ tiup --tag mycluster playground

The manifests

The .tiup/manifest directory contains tiup's metadata.

The root manifest seems to be tiup-manifest.index. It's pretty simple:

{
  "description": "TiUP supported components list",
  "modified": "2020-02-27T15:20:35+08:00",
  "tiup_version": "v0.0.1",
  "components": [
    {
      "name": "tidb",
      "desc": "TiDB is an open source distributed HTAP database compatible with the MySQL protocol",
      "platforms": [
        "darwin/amd64",
        "linux/amd64"
      ]
    },

...

There's a tiup version number there, so presumably the manifest format can be upgraded incompatibly. Does tiup check that version number before attempting to interpret the manifest? No, I don't see the version number used for anything. These manifest files should probably contain a format version number that can be incremented to indicate backwards-incompatible changes, and tiup should verify that it is compatible before using them.
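Something like the following is what I have in mind. This is just a sketch; the format_version field and maxSupportedFormat constant are hypothetical, not part of tiup's actual manifests:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Hypothetical manifest header. "format_version" is not a real tiup field;
    // it is the kind of field I'm suggesting the manifests should carry.
    type manifestHeader struct {
        FormatVersion int    `json:"format_version"`
        Description   string `json:"description"`
    }

    // The newest manifest format this build of the tool understands.
    const maxSupportedFormat = 1

    func checkManifestFormat(raw []byte) error {
        var hdr manifestHeader
        if err := json.Unmarshal(raw, &hdr); err != nil {
            return fmt.Errorf("malformed manifest: %v", err)
        }
        if hdr.FormatVersion > maxSupportedFormat {
            // Refuse to interpret manifests written in a newer,
            // potentially incompatible format.
            return fmt.Errorf("manifest format %d is newer than supported format %d; upgrade the tool",
                hdr.FormatVersion, maxSupportedFormat)
        }
        return nil
    }

    func main() {
        raw := []byte(`{"format_version": 2, "description": "TiUP supported components list"}`)
        if err := checkManifestFormat(raw); err != nil {
            fmt.Println(err)
        }
    }

The point is just that the tool refuses to parse data written in a format it doesn't understand, rather than silently misinterpreting it.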

I don't see any indication of "channels" in this index, though tiup supports updating to either "latest" or "nightly". From the description in tiup help, I think tiup only supports a single installation at a time, not many concurrent installations like other version managers do.

It looks like tiup uses a custom manifest format. I would recommend just using https://theupdateframework.io/. At the very least, read their specs carefully. There are a number of mistakes everybody makes when writing software updaters.

The "nightlyness" of a component is embedded in the component manifest itself.

The way these manifests are organized seems wrong to me. There's no concept of a release here that contains a full set of components. Instead every component is versioned independently. You either get the latest version of every component, or you pick the exact version of each individual component.

As the TiDB distribution grows this seems like it will be miserable for end users. How do I get all the components that were distributed as part of TiDB 3.0.0? Yeah, today it might be the case that the major components of TiDB all have the same version numbers, so tiup could pick matching ones, but that's not true of, e.g., Prometheus.

The metadata appears to have no concept of a release of the whole system, just releases of individual components.

Updating components

tiup update --all is silent. I guess it worked? Seems like status output would be more reassuring.

Reused URLs and file drift

I see that nightly components are downloaded from fixed URLs:

https://tiup-mirrors.pingcap.com/pd-nightly-linux-amd64.tar.gz

Fixed URLs like this caused rustup huge problems and I would recommend against them. The problem is that by reusing the same URL for binaries that change over time, you run into situations where the binary is not the one you think it is.

This is one of the biggest problems we had with rustup. Some of the problems are detailed in https://internals.rust-lang.org/t/future-updates-to-the-rustup-distribution-format/4196

E.g., imagine somebody installing all the nightly components at the same time a new nightly is being deployed; they could end up getting some components from one nightly and other components from another. Similar problems can happen with a CDN where some edge nodes have one binary and others have another. Similar problems happen if you try to verify the hash of a component that changes over time: sometimes the binary will not be the one you expect and the checksum will fail.

From the manifests it looks like this is how the "master" version is treated as well. The tiup binary appears to have the same treatment - just one file in one location that is overwritten and redownloaded.

I recommend that tiup not download anything except the root manifest from locations that are regularly overwritten. Doing so will likely cause weird intermittent failures for users.

In general you want to upload artifacts to unique locations and never modify them. Modifying deployed files leads to all kinds of problems. Ideally there is one "root" index that changes atomically, and every other resource the update system uses can be found from that root, hashed and signed and immutable forever.

It will make your life much easier in the long term.

I don't see any hashes anywhere in the manifests. Embedding hashes in the manifests lets you be sure that the file you have after download is the one you are supposed to have. It at least lets the tool detect the kinds of problems caused by overwriting files on the CDN.

Everything should be hashed and verified.

Ok, from reading the tiup source code I see that there are .sha1 files next to the files being downloaded. This is exactly what I did in the first version of rustup and it was a mistake.

Threading hashes throughout the manifests creates a merkle tree that lets everything be verified correctly from that one root index.
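To make that concrete, here is a rough sketch of the verification step that manifest-embedded hashes enable. The artifact struct, field names, and choice of SHA-256 are my assumptions, not tiup's current design:

    package main

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "io"
        "os"
    )

    // Hypothetical manifest entry: the manifest records the expected digest of
    // each artifact, so every download can be verified against the root index.
    type artifact struct {
        URL    string
        SHA256 string
    }

    // verifyDownload hashes the file on disk and compares it to the digest the
    // manifest says it should have.
    func verifyDownload(path string, want artifact) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()

        h := sha256.New()
        if _, err := io.Copy(h, f); err != nil {
            return err
        }
        got := hex.EncodeToString(h.Sum(nil))
        if got != want.SHA256 {
            return fmt.Errorf("hash mismatch for %s: got %s, want %s", want.URL, got, want.SHA256)
        }
        return nil
    }

    func main() {
        // Hypothetical, versioned URL and placeholder digest for illustration.
        a := artifact{
            URL:    "https://example.com/pd-v3.0.0-linux-amd64.tar.gz",
            SHA256: "0000000000000000000000000000000000000000000000000000000000000000",
        }
        if err := verifyDownload("pd-v3.0.0-linux-amd64.tar.gz", a); err != nil {
            fmt.Println(err)
        }
    }

If the digest in the manifest never matches what the mirror serves, the tool knows immediately that the file was overwritten out from under it, instead of failing in some confusing way later.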

The way the component manifests are designed now, they must be continually mutated, as they contain the versions of every component.

A more reliable structure would be to have every release of the entire system get its own manifest that contains the URLs and hashes of every component. That release manifest is uploaded once and never touched.

The root index contains a list of all the release manifests and is the only file that is ever mutated.
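Sketching that layout as Go types (all names here are hypothetical; this is the shape I'm suggesting, not tiup's current metadata):

    package main

    // A release of the whole system: written once, never modified. It pins the
    // exact version, URL, and hash of every component that shipped together.
    type ReleaseManifest struct {
        Version    string               `json:"version"` // e.g. "v3.0.0"
        Components map[string]Component `json:"components"`
    }

    type Component struct {
        Version string `json:"version"`
        URL     string `json:"url"`    // unique, immutable location
        SHA256  string `json:"sha256"` // digest recorded at release time
    }

    // The root index is the only mutable file. It lists every release manifest
    // along with its hash, so everything below it forms a verifiable tree
    // rooted at this one file.
    type RootIndex struct {
        Releases []ReleaseRef `json:"releases"`
    }

    type ReleaseRef struct {
        Version string `json:"version"`
        URL     string `json:"url"`
        SHA256  string `json:"sha256"`
    }

    func main() {}

With this shape, "give me everything that shipped as TiDB 3.0.0" is a single lookup in one immutable file.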

Note that verifying the integrity and security of the root manifest is fairly complex. Again, you don't want to have a .sha1 file sitting next to the root manifest because it will sometimes not agree with the contents of the manifest.

No URLs in manifests

The manifests themselves don't contain the URLs of components. Instead the URLs are synthesized from the metadata according to rules within tiup.

This seems like a good idea until the rules for locating a tarball need to change, or, e.g., tarballs are switched to some other file format. Once that happens, tiup needs to encode two sets of rules and will need a mechanism for deciding which URL-generating rules to apply.
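Roughly, the tool then has to carry something like this around forever. The rule details and the rules version are made up for illustration, not taken from tiup:

    package main

    import "fmt"

    // Hypothetical component record. rulesVersion is not a real tiup concept;
    // it is what you end up needing once the URL-generation rules change.
    type component struct {
        name         string
        version      string
        rulesVersion int
    }

    // componentURL has to keep every historical URL-generation rule alive and
    // pick the right one per component. Storing the URL in the manifest
    // instead would make this function unnecessary.
    func componentURL(c component) (string, error) {
        switch c.rulesVersion {
        case 1:
            // original layout: tarballs directly under the mirror root
            return fmt.Sprintf("https://tiup-mirrors.pingcap.com/%s-%s-linux-amd64.tar.gz",
                c.name, c.version), nil
        case 2:
            // hypothetical later layout: per-component directories, .tar.xz
            return fmt.Sprintf("https://tiup-mirrors.pingcap.com/%s/%s/%s-linux-amd64.tar.xz",
                c.name, c.version, c.name), nil
        default:
            return "", fmt.Errorf("unknown URL rules version %d", c.rulesVersion)
        }
    }

    func main() {
        u, err := componentURL(component{name: "pd", version: "nightly", rulesVersion: 1})
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(u) // https://tiup-mirrors.pingcap.com/pd-nightly-linux-amd64.tar.gz
    }

Putting an explicit URL in each manifest entry keeps all of that logic on the publishing side, where it can change freely without stranding old clients.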

Self updates

Running tiup update --self produces no output. Producing no output on success is a valid design, but it's not reassuring for humans.

A few more notes

Another thing I notice is that various commands lazily download components as needed.

This is a nice convenience, and probably the right default, but it is terrible for automation. Any time the tool might hit the network, there needs to be some way to split the command in two: the first part doing the networking and installation, and the second running the tool locally while guaranteeing it doesn't hit the network.

So, e.g., I would want a command like tiup playground --no-install.

It looks like self-updates are done even if there is no updated binary - that tiup is just a file sitting at a known location that is blindly redownloaded and installed on request. tiup doesn't know if it needs an update - it just always updates.
