Embedded - another term for the database being actually a library (and thus running in the same process as your application) as opposed to: Non-embedded - DB is run as an application
For a web app, just use a non-embedded DB like Postgres. You'll run into the limitations of something like SQLite too quickly.
non-embedded(hosted/ ex- postgres) -> where does it need to live in order to communicate between an api? Should a database live on it's own machine or on the same machine as your server?
It needs to be accessible to your backend. This might be over the internet or it could be on a private network. Eventually you'll want to run them on separate machines so you can optimise machine resources accordingly (using a different AWS instance type for each for example). For a side project, it's fine to be on the same machine. Even better, use a free Heroku addon and don't worry about how to configure the DB.
Imagine you are writing an employee of the month app, where companies can sign up. When the company signs up you create the first user for the company. If the user's information is invalid you don't want to create anything, but when you're trying to create the user you need the company to exist already. You can wrap both the insert statements (for the company and for the user) in a transaction, and then if either fails both will be rolled back. You can also use transactions for locking which prevents race conditions when updating records.
It means you get to write your DB queries in the language used by your app and not in raw SQL. It also guards against a plethora of security issues such as SQL injection and can often have other features, such as schema enforcement, type casting and relationship management.
If you take user input from a form and just use string interpolation to put it in an SQL query then anyone who enters carefully crafted SQL in your form can run whatever queries they like on your DB. See https://www.explainxkcd.com/wiki/index.php/Little_Bobby_Tables.
To guard against it, use parameterised queries. The easiest way to do this is to use an ORM. Using parameterised queries also gives you a performance benefit, as the DB can build a query plan once and run it each time that query is used rather than having to reconstruct the query plan each time.
Some columns are used in way more searches than others (primary keys are common, but you may also look up users by their email for example). To make these searches (much) faster you can add an index. This adds a performance cost when writing data as the index needs updating as well, but query speed is dramatically increased. Indexes also take up space, so ensuring you have the indexes you need and no unnecessary indexes is often important for good DB performance as you scale.
If the relationship you're modelling in the real world is a many-to-many relationship. You query things from them using JOINs. An ORM normally makes this easier.
A join lets you use multiple tables in a single query to combine data across multiple tables.
When you need to change the structure of your DB you use a migration to describe the change. This migration can then be run against your machine in development to change your DB enabling you to add the feature that depends on the DB change. When the feature is ready to go to production you can use the same migration to change the production DB in the same manner.
Yes. It depends on the DB as to how. MySQL and Postgres both support dumping the DB as a series of SQL statements that can be run to recreate the DB for example.
Various ways. You could take a DB dump and store that. For large databases this can be slow to restore though. You could also take a snapshot of the machine or the filesystem the DB is using. Filesystems like ZFS can make this a very easy and fast way to take backups.
Sharding is splitting your DB over multiple machines. By the time you need it you should be able to afford to pay someone who understands it.
Op Logs seem to be a MongoDB thing. All DBs have logs though and setting them up depends on the DB you're using.
If you have multiple instances of your web api running do you also have multiple instances of your DB (cloned?)
If you're running separate instances, for example if you support on site installations for each customer then they'll probably have a separate DB. If you're just running multiple instances of the API behind a load balancer then they'll share a single DB.
Most, if not all DBMSes support some form of replication. This allows one primary DB to send changes to one or more secondary DBs. In MySQL and Postgres these secondary DBs are commonly read only so data only needs to flow one way and they can be used for read intensive workloads to take load off the primary.