First impressions of tiup

tiup is a new version manager for TiDB.

Building

The README.md file doesn't include build instructions. tiup is written in Go. I don't know Go, but I type go build and hope it works. It does.

The website and installation

Let's follow the instructions at https://tiup.io/ and see what happens.

The TiDB link on the website links to the GitHub repo. Seems like it might want to link to the TiDB product website (which is the same as the PingCAP website).

$ curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6027k  100 6027k    0     0   873k      0  0:00:06  0:00:06 --:--:-- 1442k
Detected shell: /bin/bash
Shell profile:  /home/ubuntu/.bash_profile
/home/ubuntu/.bash_profile has been modified to to add tiup to PATH
open a new terminal or source /home/ubuntu/.bash_profile to use it
Installed path: /home/ubuntu/.tiup/bin/tiup
===============================================
Have a try:     tiup playground
===============================================

Running bash --login produces a working tiup environment.

Running tiup playground downloads playground, pd, tikv, and tidb and starts the cluster.

Error message with bad grammar:

"The component playground doesn't installed, download from repository"

From the output it looks like running playground itself triggers the installation of pd, tikv, and tidb. That is, tiup prints

Starting /home/ubuntu/.tiup/components/playground/v0.0.5/playground

and then begins downloading the remaining components. Makes me wonder what playground is, and whether playground itself is running tiup to install components.

What does tiup do to .bash_profile?

export PATH=/home/ubuntu/.tiup/bin:$PATH

The running cluster

tiup playground ends its output with:

CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --host 127.0.0.1 --port 4000 -u root
To view the dashboard: http://127.0.0.1:2379/dashboard

I can connect with mysql and I see an empty test database.

I want to view the dashboard but I'm running this on an EC2 instance, not locally. Can I change the IP address?

From the output of tiup playground --help it seems like no. I can only use 127.0.0.1.

I see the --monitor option and restart playground with it. Prometheus is downloaded and now the output looks like:

CLUSTER START SUCCESSFULLY, Enjoy it ^-^
To connect TiDB: mysql --host 127.0.0.1 --port 4000 -u root
To view the dashboard: http://127.0.0.1:2379/dashboard
To view the monitor: http://127.0.0.1:9090

What's in the .tiup directory?

The data directory is interesting. It contains two directories with semi-random names. I wonder if I get a new one every time I run playground. Seems like it, and that I need to run tiup clean --all to delete them.

Would be more foolproof for tiup to deal with this automatically.

tiup commands

The tiup CLI treats components as subcommands, which is confusing. That is, in tiup playground, playground is not a command; it is the name of a component.

What if there need to be non-executable components someday? Some components I try to run exit with errors.

This part of the tiup help output is awkward:

  # *HOW TO* reuse instance data instead of generating a new data directory each time?
  # The instances which have the same "TAG" will share the data directory: $TIUP_HOME/data/$TAG.
  $ tiup --tag mycluster playground

The manifests

The .tiup/manifest directory contains tiup's metadata.

The root manifest seems to be tiup-manifest.index. It's pretty simple:

{
  "description": "TiUP supported components list",
  "modified": "2020-02-27T15:20:35+08:00",
  "tiup_version": "v0.0.1",
  "components": [
    {
      "name": "tidb",
      "desc": "TiDB is an open source distributed HTAP database compatible with the MySQL protocol",
      "platforms": [
        "darwin/amd64",
        "linux/amd64"
      ]
    },

...

There's a tiup version number there, so presumably the manifest format can be upgraded incompatibly. Does tiup check that version number before attempting to interpret the manifest? No, I don't see the version number used for anything. These manifest files should probably contain a format version number that can be incremented to indicate backwards-incompatible changes, and tiup should verify that it is compatible before using them.
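Something like the following is what I have in mind. This is just a sketch; the format_version field and maxSupportedFormat constant are hypothetical, not part of tiup's actual manifests:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // Hypothetical manifest header. "format_version" is not a real tiup field;
    // it is the kind of field I'm suggesting the manifests should carry.
    type manifestHeader struct {
        FormatVersion int    `json:"format_version"`
        Description   string `json:"description"`
    }

    // The newest manifest format this build of the tool understands.
    const maxSupportedFormat = 1

    func checkManifestFormat(raw []byte) error {
        var hdr manifestHeader
        if err := json.Unmarshal(raw, &hdr); err != nil {
            return fmt.Errorf("malformed manifest: %v", err)
        }
        if hdr.FormatVersion > maxSupportedFormat {
            // Refuse to interpret manifests written in a newer,
            // potentially incompatible format.
            return fmt.Errorf("manifest format %d is newer than supported format %d; upgrade the tool",
                hdr.FormatVersion, maxSupportedFormat)
        }
        return nil
    }

    func main() {
        raw := []byte(`{"format_version": 2, "description": "TiUP supported components list"}`)
        if err := checkManifestFormat(raw); err != nil {
            fmt.Println(err)
        }
    }

The point is just that the tool refuses to parse data written in a format it doesn't understand, rather than silently misinterpreting it.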

I don't see any indication of "channels" in this index, though tiup supports updating to either "latest" or "nightly". From the description in tiup help, I think tiup only supports a single installation at a time, not many concurrent installations like other version managers do.

It looks like tiup uses a custom manifest format. I would recommend just using https://theupdateframework.io/. At the very least, read their specs carefully. There are a number of mistakes everybody makes when writing software updaters.

The "nightlyness" of a component is embedded in the component manifest itself.

The way these manifests are organized seems wrong to me. There's no concept of a release here that contains a full set of components. Instead every component is versioned independently. You either get the latest version of every component, or you pick the exact version of each individual component.

As the TiDB distribution grows this seems like it will be miserable for end users. How do I get all the components that were distributed as part of TiDB 3.0.0? Yeah, today it might be the case that the major components of TiDB all have the same version numbers, so tiup could pick matching ones, but that's not true of, e.g., Prometheus.

The metadata appears to have no concept of a release of the whole system, just releases of individual components.

Updating components

tiup update --all is silent. I guess it worked? Seems like status output would be more reassuring.

Reused URLs and file drift

I see that nightly components are downloaded from fixed URLs:

https://tiup-mirrors.pingcap.com/pd-nightly-linux-amd64.tar.gz

Fixed URLs like this caused rustup huge problems and I would recommend against them. The problem is that by reusing the same URL for binaries that change over time, you run into situations where the binary is not the one you think it is.

This is one of the biggest problems we had with rustup. Some of the problems are detailed in https://internals.rust-lang.org/t/future-updates-to-the-rustup-distribution-format/4196

E.g., imagine somebody installing all the nightly components at the same time a new nightly is being deployed; they could end up getting some components from one nightly and other components from another. Similar problems can happen with a CDN where some edge nodes have one binary and others have another. Similar problems happen if you try to verify the hash of a component that changes over time: sometimes the binary will not be the one you expect and the checksum will fail.

From the manifests it looks like this is how the "master" version is treated as well. The tiup binary appears to have the same treatment - just one file in one location that is overwritten and redownloaded.

I recommend that tiup not download anything except the root manifest from locations that are regularly overwritten. Doing so will likely cause weird intermittent failures for users.

In general you want to upload artifacts to unique locations and never modify them. Modifying deployed files leads to all kinds of problems. Ideally there is one "root" index that changes atomically, and every other resource the update system uses can be found from that root, hashed and signed and immutable forever.

It will make your life much easier in the long term.

I don't see any hashes anywhere in the manifests. Embedding hashes in the manifests lets you be sure that the file you have after download is the one you are supposed to have. It at least lets the tool detect the kinds of problems caused by overwriting files on the CDN.

Everything should be hashed and verified.

Ok, from reading the tiup source code I see that there are .sha1 files next to the files being downloaded. This is exactly what I did in the first version of rustup and it was a mistake.

Threading hashes throughout the manifests creates a merkle tree that lets everything be verified correctly from that one root index.
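To make that concrete, here is a rough sketch of the verification step that manifest-embedded hashes enable. The artifact struct, field names, and choice of SHA-256 are my assumptions, not tiup's current design:

    package main

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "io"
        "os"
    )

    // Hypothetical manifest entry: the manifest records the expected digest of
    // each artifact, so every download can be verified against the root index.
    type artifact struct {
        URL    string
        SHA256 string
    }

    // verifyDownload hashes the file on disk and compares it to the digest the
    // manifest says it should have.
    func verifyDownload(path string, want artifact) error {
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()

        h := sha256.New()
        if _, err := io.Copy(h, f); err != nil {
            return err
        }
        got := hex.EncodeToString(h.Sum(nil))
        if got != want.SHA256 {
            return fmt.Errorf("hash mismatch for %s: got %s, want %s", want.URL, got, want.SHA256)
        }
        return nil
    }

    func main() {
        // Hypothetical, versioned URL and placeholder digest for illustration.
        a := artifact{
            URL:    "https://example.com/pd-v3.0.0-linux-amd64.tar.gz",
            SHA256: "0000000000000000000000000000000000000000000000000000000000000000",
        }
        if err := verifyDownload("pd-v3.0.0-linux-amd64.tar.gz", a); err != nil {
            fmt.Println(err)
        }
    }

If the digest in the manifest never matches what the mirror serves, the tool knows immediately that the file was overwritten out from under it, instead of failing in some confusing way later.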

The way the component manifests are designed now, they must be continually mutated, as they contain the versions of every component.

A more reliable structure would be to have every release of the entire system get its own manifest that contains the URLs and hashes of every component. That release manifest is uploaded once and never touched.

The root index contains a list of all the release manifests and is the only file that is ever mutated.
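Sketching that layout as Go types (all names here are hypothetical; this is the shape I'm suggesting, not tiup's current metadata):

    package main

    // A release of the whole system: written once, never modified. It pins the
    // exact version, URL, and hash of every component that shipped together.
    type ReleaseManifest struct {
        Version    string               `json:"version"` // e.g. "v3.0.0"
        Components map[string]Component `json:"components"`
    }

    type Component struct {
        Version string `json:"version"`
        URL     string `json:"url"`    // unique, immutable location
        SHA256  string `json:"sha256"` // digest recorded at release time
    }

    // The root index is the only mutable file. It lists every release manifest
    // along with its hash, so everything below it forms a verifiable tree
    // rooted at this one file.
    type RootIndex struct {
        Releases []ReleaseRef `json:"releases"`
    }

    type ReleaseRef struct {
        Version string `json:"version"`
        URL     string `json:"url"`
        SHA256  string `json:"sha256"`
    }

    func main() {}

With this shape, "give me everything that shipped as TiDB 3.0.0" is a single lookup in one immutable file.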

Note that verifying the integrity and security of the root manifest is fairly complex. Again, you don't want to have a .sha1 file sitting next to the root manifest because it will sometimes not agree with the contents of the manifest.

No URLs in manifests

The manifests themselves don't contain the URLs of components. Instead the URLs are synthesized from the metadata according to rules within tiup.

This seems like a good idea until the rules for locating a tarball need to change, or, e.g., tarballs are switched to some other file format. Once that happens, tiup needs to encode two sets of rules and will need a mechanism for deciding which URL-generating rules to apply.
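Roughly, the tool then has to carry something like this around forever. The rule details and the rules version are made up for illustration, not taken from tiup:

    package main

    import "fmt"

    // Hypothetical component record. rulesVersion is not a real tiup concept;
    // it is what you end up needing once the URL-generation rules change.
    type component struct {
        name         string
        version      string
        rulesVersion int
    }

    // componentURL has to keep every historical URL-generation rule alive and
    // pick the right one per component. Storing the URL in the manifest
    // instead would make this function unnecessary.
    func componentURL(c component) (string, error) {
        switch c.rulesVersion {
        case 1:
            // original layout: tarballs directly under the mirror root
            return fmt.Sprintf("https://tiup-mirrors.pingcap.com/%s-%s-linux-amd64.tar.gz",
                c.name, c.version), nil
        case 2:
            // hypothetical later layout: per-component directories, .tar.xz
            return fmt.Sprintf("https://tiup-mirrors.pingcap.com/%s/%s/%s-linux-amd64.tar.xz",
                c.name, c.version, c.name), nil
        default:
            return "", fmt.Errorf("unknown URL rules version %d", c.rulesVersion)
        }
    }

    func main() {
        u, err := componentURL(component{name: "pd", version: "nightly", rulesVersion: 1})
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(u) // https://tiup-mirrors.pingcap.com/pd-nightly-linux-amd64.tar.gz
    }

Putting an explicit URL in each manifest entry keeps all of that logic on the publishing side, where it can change freely without stranding old clients.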

Self updates

Running tiup update --self produces no output. Producing no output on success is a valid design, but it's not reassuring for humans.

A few more notes

Another thing I notice is that various commands lazily download components as needed.

This is a nice convenience, and probably the right default, but it is terrible for automation. Any time the tool might hit the network, there needs to be some way to split the command in two: the first part doing the networking and installation, and the second running the tool locally while guaranteeing it doesn't hit the network.

So, e.g., I would want a command like tiup playground --no-install.

It looks like self-updates are done even if there is no updated binary - that tiup is just a file sitting at a known location that is blindly redownloaded and installed on request. tiup doesn't know if it needs an update - it just always updates.
