Proposal:
-
The schema version becomes an abridged semver: `[major].[minor]`. A major version change means something in code needs to be changed to support the new data (to the extent that we think it matters; we could decide that something wasn't a big deal if slightly misformatted). Minor version changes are safe to apply: they might add a field to something, but not in a way that's expected to cause any problems.
-
A given client version comes bundled with a given schema version, and it can fetch minor schema updates for its major version, on daily checks or when it sees a `Zotero-Schema-Version` header when starting a sync. This would be client-based, not server-based, so for the daily check we'd make schema versions available as `/schema/4` or something (which might serve version `4.2`). If a sync returned `Zotero-Schema-Version` and it was a higher major version, the client would ignore it.
-
When uploading data to a library, the client would include its current schema version in a `Zotero-Schema-Version` request header, and the server would store the version with each library if it was a known schema version (to avoid mischief) and greater than the currently stored version for that library.
-
When starting a sync for a library, the client would check the `Zotero-Library-Schema-Version` header, which would be the stored version for that library, meaning it's the highest-possible version for data in that library. If it was higher than the client's current schema version, the client would stop syncing that library and say that syncing that library required a newer version of the app. (If it was only a higher minor version, it might mean that the minor schema update (from `Zotero-Schema-Version` at the start of the main sync process) failed for some reason, and it could just show a temporary error rather than saying the client needs to be updated.)
-
Unknown properties shouldn't ever happen under this scheme, so they would cause the object download to fail. The items would be retried on a backoff schedule (in case there was a server-side problem) or after an upgrade (in case there was a client bug), as they already are now (at least in the desktop app).
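To make the version checks above concrete, here's a rough sketch in Python. Everything in it (`SchemaVersion`, `check_library`, `next_stored_version`, the `SyncDecision` names) is made up for illustration and isn't taken from any existing client or server code; it only encodes the rules as described: a higher major version on the library stops the sync, a higher minor version is treated as a temporary error, and the server only raises a library's stored version for known schema versions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


@dataclass(frozen=True, order=True)
class SchemaVersion:
    """Abridged semver: [major].[minor]. Ordering compares (major, minor)."""
    major: int
    minor: int

    @classmethod
    def parse(cls, text: str) -> "SchemaVersion":
        major, minor = text.strip().split(".")
        return cls(int(major), int(minor))


class SyncDecision(Enum):
    SYNC = "sync"
    TEMPORARY_ERROR = "temporary error"    # minor update probably failed; retry later
    UPGRADE_REQUIRED = "upgrade required"  # library needs a newer app version


def check_library(client: SchemaVersion, library: SchemaVersion) -> SyncDecision:
    """Client side: compare against the Zotero-Library-Schema-Version value."""
    if library.major > client.major:
        return SyncDecision.UPGRADE_REQUIRED
    if library > client:
        # Same major, higher minor: the client should have been able to fetch it.
        return SyncDecision.TEMPORARY_ERROR
    return SyncDecision.SYNC


def next_stored_version(
    stored: Optional[SchemaVersion],
    uploaded: SchemaVersion,
    known_versions: Set[SchemaVersion],
) -> Optional[SchemaVersion]:
    """Server side: only ever raise the stored version, and only to a known one."""
    if uploaded in known_versions and (stored is None or uploaded > stored):
        return uploaded
    return stored
```

For example, a client on `4.2` would sync a library stored at `4.1`, show a temporary error for `4.3`, and ask for an app upgrade for `5.0`.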
Issues:
-
Just because a client with a given schema version writes to a library, it doesn't necessarily mean the data is incompatible, but we have no good way of knowing that, so it requires a sync cut-off. (I think it would be crazy for the API to start comparing the data to all past schema versions, for example.)
-
This cut-off would happen even when we added new object types (e.g., annotations) that an older client wouldn't try to download anyway because it didn't know about them. (I think this would, in fact, mean that there was no reason to track library versions separately for different object types, as Michal said he was doing, because the client wouldn't even try to sync the library if it didn't support the new object type.)
-
This only partly solves the beta problem. It means that we can make a new major version available on the server and also bundle it with a beta, which is necessary for testing new sync-dependent features (good), but if the beta writes to a library, no non-beta clients will be able to sync with that library (bad).
-
We'd still want as much as possible in the schema, to minimize major versions. So as Michal says, item type image URLs (of various sizes) should be in there, and we'd want to think about other things that might help avoid major versions.
Bonus Proposal:
-
The best way to keep the cut-off from affecting too many people would be to roll out app versions that could support a new major schema version but that didn't expose the associated functionality in the UI until they were offered the new major schema version from the API. That would let us remotely turn on features after most users had upgraded to a compatible version. Unfortunately, the semver approach on its own prevents that, because it means the client, rather than the server, decides which clients to send a new major version to. (Doing it server-side also wouldn't be very nice to unofficial clients.)
-
A hybrid approach could be to do semver but also set a maximum major version in the client that it could upgrade to if available, and hide features until the major version was offered. So if the client had schema `2.4` but it had a `maxSchemaVersion` of `3`, it would check `/schema/3` before checking `/schema/2`, and only use `/2` if `/3` was a 404. Similarly, `Zotero-Schema-Version` from the API would offer a comma-separated list of the latest available version for each major version, and if the client with `2.4` and `maxSchemaVersion` of `3` saw that a `3.2` was available, it could upgrade to that and expose the hidden functionality.
-
We would test this by dropping in a `3.2` schema file locally and/or by adding `3.2` to apidev responses.
-
It's a little weird to turn on functionality remotely — and it does increase the chances of a bug that suddenly appeared even though someone hadn't upgraded (perhaps purposely) — but I think it'd be the best way to minimize sync cut-offs.
-
I'm not sure if Apple has some app store rule against enabling new functionality like this.
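The hybrid selection could look roughly like the sketch below. The function names are hypothetical, and the exact header format (here `"2.5, 3.2"`) is only an assumption based on the "comma-separated list" described above. The client walks down from its `maxSchemaVersion` to its current major version and takes the first offered version that is newer than what it has; anything above the maximum is ignored.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]  # (major, minor)


def parse_offered(header: str) -> Dict[int, Version]:
    """Parse a Zotero-Schema-Version value like "2.5, 3.2" into {major: version}."""
    offered: Dict[int, Version] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        major_text, minor_text = part.split(".")
        version = (int(major_text), int(minor_text))
        # If a major version is listed twice, keep the higher minor.
        offered[version[0]] = max(offered.get(version[0], version), version)
    return offered


def choose_schema(current: Version, max_major: int, header: str) -> Version:
    """Pick the schema version to use, preferring the highest supported major."""
    offered = parse_offered(header)
    for major in range(max_major, current[0] - 1, -1):
        candidate = offered.get(major)
        if candidate is not None and candidate > current:
            return candidate
    return current  # nothing newer that this client can handle


# Client on 2.4 with maxSchemaVersion 3: upgrades to 3.2 once it is offered,
# and until then only picks up minor updates for major version 2.
print(choose_schema((2, 4), 3, "2.5, 3.2"))
print(choose_schema((2, 4), 3, "2.5"))
```

The hidden functionality would then be shown only once the chosen version's major number reached the one the feature needs.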
The whole idea of ever getting a message that says you need to upgrade to sync is sort of unpleasant, and a major departure from our historical practice (where we didn't cut off anything for many years and then cut off 4.0 only after 5.0 had been out for about a year), but I don't see a better option, and this last part would at least keep most regular upgraders from seeing such a message, at least when we went to the trouble of adding forward-compatibility.
@dstillman The proposal looks good. Some comments on a few of the issues:
Issues:
The bad part is kind of expected and needs to be communicated clearly to beta testers before they sign up for the beta. The problem could be even simpler than our schema problems. A beta version can have some big changes to the DB (new objects, troublesome migration), so if you went from the beta back to the normal app, the schema version of the internal DB would be lower and the DB would just keep returning errors and nothing would work. So if beta testers suddenly decided they wanted to quit for some reason, they would have to remove the app and reinstall it from the App Store for it to work.
Depending on what stuff we want to store in schema I'll try to think about other things. For example the fields we mentioned in chat might also require format (number/date format), possible values for pickers or additional icons as well. We might also want to try storing layouts in schema, but I don't know if we want to experiment with UI much.
Bonus:
We could do either what you proposed in point 2, or we could have a switch on the backend that would signal when to return the new schema version. So if the switch was off and the client asked for the new schema, it would get the old schema anyway. After we gain significant adoption of the latest build, we can flip the switch on the backend and people would sync up with the newer schema. But it would be slightly misleading, and we can do pretty much the same thing on the client, as you mentioned in 2.
It's fairly common to have "feature toggles". Also the functionality would be there all the time, just not visible immediately. If there were bugs in the new code, they could be visible even if we didn't share the new schema version yet. But of course there can be bugs that appear only after it's turned on, it depends on what kind of bugs.
Anyway, if we notice an increase in crashes after we publish a new schema, we can always just remove the schema again. It shouldn't be a problem, because even if, for example, we added a new object and then removed the schema, the (possibly broken, crash-causing) object would still be stored in the DB, but it wouldn't appear to the user because the schema doesn't support it. Broken UI elements would be hidden again. I guess it should/could work.
Or we could have feature toggles separated from the schema. The new schema would update immediately, but big new features could be enabled by another endpoint (for example new objects, UI components or whole screens). This way we could control individual features that cause crashes and turn them off until fixed without messing up the schema. We can also do it gradually: release a version, enable the new schema, enable individual features. If there is a rise in crashes after enabling individual features, we can turn them off. If there are crashes after the new schema, we're back to the previous case (try to remove the schema). And if there are crashes immediately after a release, then we need to make a quick new release anyway.
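As a very rough illustration of toggles kept separate from the schema, something like the following could decide what to show. All names here (`Feature`, `enabled_features`, the toggle dictionary standing in for the response of that other endpoint) are hypothetical. The assumption is that a feature is shown only if the active schema supports its data and its toggle is explicitly on, so a missing toggle means "off".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Feature:
    name: str
    min_schema_major: int  # lowest schema major version whose data the feature needs


def enabled_features(
    built_in: Iterable[Feature],
    schema_major: int,
    remote_toggles: Dict[str, bool],
) -> Set[str]:
    """Names of features to expose in the UI right now."""
    return {
        feature.name
        for feature in built_in
        if schema_major >= feature.min_schema_major
        and remote_toggles.get(feature.name, False)  # absent toggle counts as off
    }


features = [Feature("annotations", 3), Feature("new-picker", 2)]

# Schema 3 is active, but annotations has been switched off after a crash spike.
print(enabled_features(features, 3, {"annotations": False, "new-picker": True}))
# prints {'new-picker'}
```

Turning a single crashing feature off then only means changing its toggle, and the schema stays where it is.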
It should be fine with Apple. As I mentioned in previous point, we'll have all the code in place and they can inspect it. We'll only enable/disable some parts of it remotely. What is against the rules is to download and execute external code which they can't review. We should not have any problems in our case.
Anyway, if we are careful enough, users won't ever see those messages or have those problems. If we need to introduce big new breaking changes, at least we'll be prepared to handle them somewhat gracefully. If people don't want to update their apps, then we can't really help them. It would be possible if they only used the iOS app (let them stay on the current schema version forever), but if they switch between desktop/web and mobile clients, then they just have to update to be able to use our newest features.