The C library provides an interface to a Couchbase Cluster from a C application. It is used by many of our client libraries as a lower layer which performs network I/O and protocol handling. It may also be used as a standalone client.
The library is known to function on Linux, Mac OS X, and Windows. Other platforms may work but are not routinely tested.
The library aims to conform to the C89 standard ("ANSI C") and as such should build with any C compiler.
In the basic usage of the client library, your application will establish a client instance and connect it. Once the instance has been connected you may start performing operations.
In the initial examples error handling will be treated by a simple die() function which prints the error message and exits. In real applications there are various types of errors, some more severe than others, and they should be handled by the application's own logic where applicable. Error handling is discussed in later sections.
To create an instance, an lcb_create_st structure must first be initialized with the proper fields:
#include <stdio.h>
#include <stdlib.h>
#include <libcouchbase/couchbase.h>

static void die(const char *msg, lcb_error_t err)
{
    fprintf(stderr, "%s: 0x%x, %s\n", msg, err, lcb_strerror(NULL, err));
    exit(EXIT_FAILURE);
}

void create_instance(void)
{
    lcb_t instance;
    lcb_error_t err;
    struct lcb_create_st crparams = { 0 };

    /** Initialize the host to one of the cluster nodes */
    crparams.v.v0.host = "1.2.3.4";
    err = lcb_create(&instance, &crparams);
    if (err != LCB_SUCCESS) {
        die("Couldn't create instance", err);
    }
}
Once your instance has been created it must be connected. Since the library is asynchronous at its core, connecting involves scheduling the connection and then waiting for it to complete.
err = lcb_connect(instance);
if (err != LCB_SUCCESS) {
    die("Couldn't schedule connection", err);
}
lcb_wait(instance);
/** Instance is now connected! */
The host field in the parameter structure may be one of the following:
- A simple hostname or IP, e.g. foo.com
- A semicolon-delimited list of hosts to try, e.g. foo.com;bar.com
- A host-port pair (of which multiple entries may be passed), e.g. foo.com:9000;bar.com:800. Note that passing the port should only be needed if the cluster is not listening on the default port of 8091. This may happen if using a customized cluster (for example the cluster_run script packaged with the server source code) or if there is an HTTP proxy layer between the client and server which listens on an alternate port.
Note that errors may take place during the initial connection attempt as well. We'll examine how to check for errors like this later on.
In a production deployment, an application will typically use a non-default bucket with a password. The following fields are available in the v0, v1, and v2 creation options (a usage sketch follows the list):
- user - The username for the bucket. At the time of writing this should either be empty or set to the name of the bucket itself. Do not use Administrator, as the administrative account may not be used to access the data buckets.
- bucket - The bucket to connect to.
- passwd - The SASL password for the bucket. As with user, do not use the administrative password. This password is the SASL bucket password.
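For example, the credentials for a password-protected bucket might be supplied as follows (a sketch; the bucket name and password here are placeholders):

struct lcb_create_st crparams = { 0 };
crparams.v.v0.host = "1.2.3.4";
crparams.v.v0.bucket = "mybucket"; /** hypothetical bucket name */
crparams.v.v0.user = "mybucket";   /** same as the bucket name */
crparams.v.v0.passwd = "s3cret";   /** the SASL bucket password */
err = lcb_create(&instance, &crparams);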
Key-value operations in lcb are composed of three stages:
- Schedule the operation
- Begin I/O on the operation
- Handle the callback which contains the operation's result
/**
 * Operations are passed to libcouchbase as an array of pointers
 * to operation structures
 */
lcb_error_t err;
lcb_store_cmd_t cmd = { 0 }, *cmdlist = &cmd;

/** Buffer and size to use for the key */
cmd.v.v0.key = "Hello";
cmd.v.v0.nkey = 5;

/** Buffer and size to use for the value */
cmd.v.v0.bytes = "World!";
cmd.v.v0.nbytes = 6;

/**
 * Since store_cmd covers different mutation types, we must explicitly state
 * which one we want here
 */
cmd.v.v0.operation = LCB_SET;

err = lcb_store(instance, NULL, 1, (const lcb_store_cmd_t * const *)&cmdlist);
if (err != LCB_SUCCESS) {
    die("Couldn't schedule storage operation", err);
}

/** Perform I/O on the operation */
lcb_wait(instance);
The full API of the operation structures may be found in the <libcouchbase/arguments.h> header.
To actually handle the response, you need to set up a callback for the operation. Note that callbacks are set per-instance (not per-operation), however a different callback must be provided for each distinct operation type. You typically should do this right after the instance has been created:
#include <inttypes.h> /** For PRIX64 */

static void store_handler(lcb_t instance,
                          const void *cookie,
                          lcb_storage_t op,
                          lcb_error_t err,
                          const lcb_store_resp_t *resp)
{
    if (err != LCB_SUCCESS) {
        die("Error during store", err);
    }
    printf("Stored key %.*s with CAS=0x%" PRIX64 "\n",
           (int)resp->v.v0.nkey, (const char *)resp->v.v0.key,
           resp->v.v0.cas);
}
void create_instance(void)
{
    /** .. previous code … */
    lcb_set_store_callback(instance, store_handler);
}
The signatures for the callbacks, as well as the API functions which set them, are found in <libcouchbase/callbacks.h>.
The library is asynchronous at its core. In addition to the fact that it must operate in asynchronous environments, it must also operate with and utilize multiple sockets simultaneously.
Almost all operations within the library are comprised of the following stages:
- Define the request structure itself; this means providing any input for the specified operation.
- Associate a user-defined context with the request. This user-defined context (known in the library as a cookie) can be used to associate a specific response received in a callback with a specific request. This is effectively an opaque void pointer that is never dereferenced or freed by the library (a sketch of this pattern follows the list).
- Schedule the operation along with its cookie in the API. This will place the request inside the library's internal queue.
- Wait for the operation to complete. The completion of the operation is signalled by the receipt of the callback. The callback is always invoked (either explicitly or implicitly) with the cookie associated with the request for the response passed to it. Waiting for the operation to complete may be done with a call to lcb_wait().
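As a minimal sketch of the cookie pattern (the my_result type and result_handler function are hypothetical application code, not part of the library):

struct my_result {
    lcb_error_t rc;
    lcb_cas_t cas;
};

/** The cookie passed at scheduling time arrives here untouched */
static void result_handler(lcb_t instance, const void *cookie,
                           lcb_storage_t op, lcb_error_t err,
                           const lcb_store_resp_t *resp)
{
    struct my_result *result = (struct my_result *)cookie;
    result->rc = err;
    result->cas = (err == LCB_SUCCESS) ? resp->v.v0.cas : 0;
}

/** At scheduling time: */
struct my_result result = { LCB_SUCCESS, 0 };
err = lcb_store(instance, &result, 1, (const lcb_store_cmd_t * const *)&cmdlist);
lcb_wait(instance);
/** result.rc and result.cas have now been populated by the callback */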
Note that lcb_wait() is intended primarily for synchronous environments where there is no explicit event loop to which the calling code implicitly returns control; thus lcb_wait() drives the event loop until the operation is complete. For asynchronous environments, the API calls schedule asynchronous I/O operations and those operations will be performed as soon as the event loop regains control, and thus there is no need to call lcb_wait().
Most operations return an lcb_error_t status code. A successful operation is defined by the return code of LCB_SUCCESS. A full listing of error codes may be found inside the <libcouchbase/error.h> header.
In order to handle errors properly, the application developer must understand what the errors mean and whether they indicate a retriable or a fatal error. In general a retriable error indicates that the failure was transient and may be recoverable at some point in the future, while a fatal error indicates more severe issues that will likely not be alleviated unless some other action is taken.
The lcb_errflags_t enumeration defines a set of flags which are associated with each error code. These flags define the type of error, e.g. LCB_ERRTYPE_INPUT if this is the result of a malformed parameter passed to the library, LCB_ERRTYPE_DATAOP if this is an error code received from the server when it was unable to satisfy certain data constraints (for example, a missing key or a CAS mismatch), and so on.
The LCB_EIF<TYPE> macros, where <TYPE> represents one of the errflags_t flags, may be used to check if an error is of a specific type.
static void get_callback(
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_get_resp_t *resp)
{
    if (err == LCB_SUCCESS) {
        printf("Successfully retrieved key!\n");
    } else if (LCB_EIFDATA(err)) {
        switch (err) {
        case LCB_KEY_ENOENT:
            printf("Key not found!\n");
            break;
        default:
            printf("Received other unhandled data error\n");
            break;
        }
    } else if (LCB_EIFTMP(err)) {
        printf("Transient error received. May retry\n");
    }
}
Success and failure depend on the context. A successful return code for one of the data operation APIs (for example lcb_store) does not mean the operation itself succeeded and the key was successfully stored. Rather it means the key was successfully placed inside the library's internal queue. The actual error code is delivered within the response itself.
Errors may be received when scheduling operations inside the library. If a scheduling API returns anything but LCB_SUCCESS then it implies the operation itself failed as a whole and no callback will be delivered for that operation. The library may also mask errors during the scheduling phase and deliver them asynchronously to the user via a callback if e.g. implementation constraints do not easily allow the immediate returning of an error code. Conversely, if a scheduling API returns LCB_SUCCESS then the callback will always be invoked.
Various settings may be retrieved and tuned via the lcb_cntl interface. This interface allows you to modify various timeout, buffer size, and behavior settings. lcb_cntl is modelled after the fcntl and ioctl interfaces in that it is passed a mode and operation constant, and an operation-specific pointer to serve as the operand.
To set the "operation timeout", for example, you may perform the following:

lcb_error_t err;
lcb_uint32_t tmo;

/** Make the timeout 5 seconds */
tmo = 5000000;
err = lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_OP_TIMEOUT, &tmo);
The first argument to lcb_cntl is the instance which represents the current handle on the cluster. The second is the mode of the operation, which can either be LCB_CNTL_SET to modify a setting or LCB_CNTL_GET to retrieve it. The third argument is the actual setting to operate on - in this case LCB_CNTL_OP_TIMEOUT. Finally, the last argument is the operand for the operation. The type and format of this argument is specific to the mode and operation specified.
Typically for the LCB_CNTL_GET mode, data will be copied from the internal setting into the argument provided, and in the LCB_CNTL_SET mode, data will be copied from the operand into the internal setting. Note that not all operations support all modes and that some operations may not be available in some library/cluster versions.
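For example, reading the operation timeout back mirrors the SET example above (a sketch):

lcb_uint32_t cur_tmo = 0;
err = lcb_cntl(instance, LCB_CNTL_GET, LCB_CNTL_OP_TIMEOUT, &cur_tmo);
if (err == LCB_SUCCESS) {
    printf("Operation timeout is %u microseconds\n", (unsigned)cur_tmo);
}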
The operation will return LCB_SUCCESS if successful. If the operation is not supported in the current library version, LCB_NOT_SUPPORTED will be returned. Note that specific handlers may return specific error codes as well.
The full listing of all the opcodes and the specific arguments they accept can be found inside the <libcouchbase/cntl.h> header.
In order to retain backwards and forwards compatibility with older/newer library versions, it is recommended to reference the actual numeric opcode instead of the constant name itself. This way, if the application is run against a library version which lacks the setting, the result is a runtime error rather than a compilation failure (unless of course a compilation failure is what you want).
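As a sketch, assuming the numeric opcode behind LCB_CNTL_OP_TIMEOUT is 0x00 in your copy of <libcouchbase/cntl.h> (verify the value against the header you actually build with):

lcb_uint32_t tmo = 2500000;
/** 0x00 is assumed to be the opcode for LCB_CNTL_OP_TIMEOUT; check your header */
err = lcb_cntl(instance, LCB_CNTL_SET, 0x00, &tmo);
if (err == LCB_NOT_SUPPORTED) {
    /** Degrade gracefully at runtime rather than failing to compile */
}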
You may use the library's logging API to forward messages to your logging framework of choice. Additionally you may also enable logging to the console's standard error by setting the LCB_LOGLEVEL environment variable.
Setting the logging level via the environment allows applications linked against the library, even those which do not offer native support for logging, to employ the diagnostics provided by the library. Thus you may do something like:
$ LCB_LOGLEVEL=5 ./my_app
and logging information will be displayed.
The value of LCB_LOGLEVEL is an integer from 1 and higher. The higher the value, the more verbose the details.
To set up your own logger, you must first define a logging callback to be invoked whenever the library emits a logging message:
static void logger(
    lcb_logprocs *procs,
    const char *module,
    int severity,
    const char *srcfile,
    int srcline,
    const char *fmt,
    va_list ap)
{
    char buf[4096];
    vsnprintf(buf, sizeof(buf), fmt, ap);
    dispatch_to_my_logging(buf);
}

static void apply_logging(lcb_t instance)
{
    lcb_logprocs procs = { 0 };
    procs.v.v0.callback = logger;
    lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_LOGGER, &procs);
}
Note that the lcb_logprocs pointer must point to valid memory and must not be freed by the user after being passed to the library, until the instance associated with it is destroyed.
Additional diagnostic information is also provided by the error callback. The error callback is considered to be a legacy interface and should generally not be used. However, the error callback does allow programmatic capture of some errors - something which is not easily possible with the logging interface.
Specifically the error callback will receive error information when a bootstrap or configuration update has failed.
static void error_callback(lcb_t instance, lcb_error_t err, const char *msg)
{
    /** … */
}

lcb_set_error_callback(instance, error_callback);
Timeouts in the library are implemented with the aim of preventing a single operation from hanging or waiting indefinitely for an I/O operation to complete. Thus while timeouts will typically be exact, when they are not they will tend to fire sooner rather than later.
On the other hand, note that internally timeouts are implemented within an event loop (as opposed to a kernel-based signal timer). This means:
- The timer can only fire when and if the event loop is active
- The timer may end up being delayed if it ends up being scheduled after other operations which themselves perform blocking timeouts or long busy loops. (Note that libcouchbase does not do this, but your own code may).
As an I/O library, libcouchbase has quite a few timeout settings. The main timeout setting is known as the operation timeout and indicates the amount of time to wait from when the operation was scheduled. If a response is not received before this time period elapses, the library will fail the operation and invoke the callback - at which point the error will be set to LCB_ETIMEDOUT. The operation to control this via lcb_cntl is LCB_CNTL_OP_TIMEOUT.
In addition to the operation timeout, there is also the bootstrap timeout which controls the amount of time the library will wait for the initial bootstrap. The bootstrap timeout is comprised of two timeout values. One value controls the overall amount of time the library should wait until a configuration has been received, while the other controls the amount of time per-node that the library will wait for until the next node is retried.
The first timeout represents the absolute timeout and is accessed via LCB_CNTL_CONFIGURATION_TIMEOUT. As mentioned before, this value places the absolute limit on the amount of time the library will wait until the client has been bootstrapped before delivering an error. Note that this timeout only makes sense during the initial connection and has no effect thereafter.
An additional timeout setting is the per-node bootstrap timeout. When fetching the cluster configuration there are often multiple nodes which may function as configuration sources. Configurations are initially retrieved once during initialization but may also be fetched later on, typically when an error condition is detected by the library or when the cluster topology changes. In such situations the behavior is to start fetching the configuration from each node in the list, in sequence. If the retrieval of the config from the first node fails with a timeout (i.e. the node is unresponsive) the library will proceed to fetch from the next node and so on until all nodes are exhausted. This setting may be accessed via the LCB_CNTL_CONFIG_NODE_TIMEOUT operation.
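Both values may be adjusted before calling lcb_connect(); a sketch, assuming both operations take an lcb_uint32_t operand in microseconds as the other timeout settings do:

lcb_uint32_t total_tmo = 10000000; /** 10 seconds for the entire bootstrap */
lcb_uint32_t node_tmo = 2000000;   /** 2 seconds per node */
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_CONFIGURATION_TIMEOUT, &total_tmo);
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_CONFIG_NODE_TIMEOUT, &node_tmo);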
At the core of the library is the ability to discover the topology of the cluster. This topology is known as a cluster map, a vBucket map, or simply a config. The config provides the client with information about which nodes are located in the cluster and which vBuckets belong to which nodes.
Each time the client is given a request for an operation, it will query the config to determine the appropriate node the request should be sent to, and then place the serialized request into that node's socket buffer.
During cluster topology changes, the client may receive or request configuration updates multiple times. These updates typically reflect a vBucket being moved to another node. During these changes the client may also forward data to the wrong server and may internally retry operations several times.
For servers prior to 2.5, the only way to retrieve the cluster configuration was via the HTTP REST API endpoint. The client would make a request to /pools/default/bucketStreaming/$bucket (where $bucket is the name of the bucket) and receive a sequence of configuration information encoded as a JSON payload.
With server 2.5, the cluster has added support for bootstrap over the memcached protocol, which removes the need for a dedicated socket for retrieving the configuration. From an API perspective this is mostly transparent, but will be detailed later on.
If the client fails to bootstrap during the initial connect phase then the client is considered to be unusable and should be reconstructed with the proper list of hosts. The client will attempt to retrieve the configuration from each host and will fail when either all hosts have been exhausted or when the bootstrap timeout has been reached.
The cluster configuration may be obtained either via the memcached protocol (in what is called Cluster Configuration Carrier Publication, or CCCP) or via the legacy mode which utilizes a connection to the REST API.
Note that CCCP is available only in cluster versions 2.5 and higher.
The default behavior of the library is to first attempt bootstrap over CCCP and then fall back to HTTP if the former fails. CCCP bootstrap is preferable as it does not require a dedicated "configuration socket", does not incur the latency and overhead of the HTTP protocol, and is more efficient for the internals of the cluster as well.
It is recommended to explicitly enable or disable one of the configuration sources. The only reason why the library attempts CCCP and then HTTP is to support legacy environments where a < 2.5 cluster is available, and/or a mixed version cluster where some nodes may support CCCP and some may not.
To explicitly define the set of configuration modes to use, specify them inside the lcb_create_st structure.
The following snippet disables HTTP bootstrapping:
struct lcb_create_st crparams = { 0 };
lcb_config_transport_t transports[] = {
    LCB_CONFIG_TRANSPORT_CCCP,
    LCB_CONFIG_TRANSPORT_LIST_END
};
lcb_t instance;

/** Requires version 2 */
crparams.version = 2;
crparams.v.v2.transports = transports;
crparams.v.v2.host = "foo.com;bar.org;baz.net";
lcb_create(&instance, &crparams);
The transports field accepts an array of "enabled transports" followed by the end-of-list element (LCB_CONFIG_TRANSPORT_LIST_END). If this parameter is specified then the library will only use the transports specified in the list.
The enabled transports will remain valid for the duration of the instance. This means that it will be used for the initial bootstrap as well as any subsequent configuration updates.
Memcached buckets do not support CCCP. You must ensure that HTTP is enabled for them (which is the default).
During the lifetime of a client, its cluster configuration may need to change or be retrieved. Reasons for this include:
- The cluster pushing a new configuration to the client
- Nodes being added, removed, or failed over to/from the cluster
- Client proactively fetching configuration due to an error
These changes are collectively known as topology changes or configuration changes. Note that, as mentioned in the list, fetching a configuration is not always the result of an actual topology change. For example if a certain node becomes unreachable to the client then the client will attempt to fetch a new configuration under the assumption that the unresponsive node has potentially been ejected from the cluster.
For proactive configuration fetches (such as in response to an error) there is effectively a responsiveness-for-efficiency trade off. If the client does not check for updated configurations often enough then it risks failing operations which could have succeeded if it knew about the updated topology, while if the client fetches the configuration too quickly it can cause a lot of traffic and increase load on the server (this especially holds true with HTTP). Additionally some server versions allocate quite a few resources for each HTTP connection and thus maintaining idle HTTP connections may also be more expensive than re-establishing them on-demand.
Rather than guess at your hardware and infrastructure requirements, the library provides several settings to help you tune it to your needs.
Each time a network-like error is encountered in the library (i.e. a timeout or connection error), an internal error counter is incremented. When this counter's value reaches a specified value the library fetches a new configuration and resets this counter to zero.
Additionally, the intervals between successive increments are timed, and if a call is made to increment the counter and the time between the last call exceeds a certain threshold then the configuration is refetched again as well.
Both these behaviors can be controlled via lcb_cntl(). To modify the error counter threshold, use the LCB_CNTL_CONFERRTHRESH setting. To modify the error delay threshold, use the LCB_CNTL_CONFDELAY_THRESH setting.
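A sketch of tuning both thresholds (the operand types assumed here are lcb_size_t for the counter and lcb_uint32_t microseconds for the delay; confirm against <libcouchbase/cntl.h>):

lcb_size_t err_thresh = 10;           /** refetch after 10 network-like errors */
lcb_uint32_t delay_thresh = 10000000; /** or if 10 seconds passed since the last fetch */
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_CONFERRTHRESH, &err_thresh);
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_CONFDELAY_THRESH, &delay_thresh);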
Each specific configuration source may also be tuned with its own settings.
CCCP relies on being able to connect to a memcached instance listening on a predefined port. Usually this is 11210, but it may be different in some environments.
You may use the mchosts field in the v2 sub-structure of lcb_create_st to specify host:port pairs for memcached nodes (that is, which memcached ports are used by Couchbase). If specified, these replace the host field in the same structure.
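A sketch of supplying custom memcached ports (the hosts and port 12000 are placeholders):

struct lcb_create_st crparams = { 0 };
crparams.version = 2;
crparams.v.v2.host = "foo.com;bar.org";
/** These hypothetical nodes expose their memcached (CCCP) port on 12000 */
crparams.v.v2.mchosts = "foo.com:12000;bar.org:12000";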
HTTP bootstrap used to follow a model of maintaining a persistent connection to the cluster and relying on the cluster to push updates to it. As part of the migration to CCCP, the internal behavior regarding HTTP has changed as well, with the library now utilizing it as a pull-based configuration source. Nevertheless it can still function to a limited extent as a push source.
Whenever the client has finished retrieving the configuration from the HTTP interface, rather than closing the socket immediately it will let the socket idle for a small amount of time. The goal of this idle period is to allow any subsequent pushed configurations from the server to arrive without having to re-open the socket.
To control this idle timeout, use the LCB_CNTL_HTCONFIG_IDLE_TIMEOUT operation for lcb_cntl.
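For example (a sketch, assuming the operand is an lcb_uint32_t in microseconds):

lcb_uint32_t idle_tmo = 5000000; /** keep the idle HTTP config socket for 5 seconds */
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_HTCONFIG_IDLE_TIMEOUT, &idle_tmo);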
The library allows you to "cache" the configuration settings in a file and then pass this file when creating another instance, which will then be configured without having to perform network I/O to retrieve the initial map.
The goal of this feature is to limit the number of initial connections for retrieving the cluster map, as well as to reduce latency for performing simple operations. Such usage is ideal for short-lived applications which are spawned as simple scripts (e.g. CGI).
To use the configuration cache you must use a special form of initialization:
lcb_error_t err;
lcb_t instance;
struct lcb_cached_config_st cacheinfo;

memset(&cacheinfo, 0, sizeof(cacheinfo));
cacheinfo.cachefile = "/tmp/lcb_cache";

/** Access the normal lcb_create_st structure via the 'createopt' field */
cacheinfo.createopt.v.v0.host = "foo.com";
err = lcb_create_compat(LCB_CACHED_CONFIG, &cacheinfo, &instance, NULL);
if (err != LCB_SUCCESS) {
    die("Couldn't create cached info", err);
}
If the cache file does not exist, the client will bootstrap from the network
and write the information to the file once it has received the map itself.
You may check to see if the client has loaded a configuration from the cache (and does not need to fetch the config from the network) by using lcb_cntl():
int is_loaded = 0;
lcb_cntl(instance, LCB_CNTL_GET, LCB_CNTL_CONFIG_CACHE_LOADED, &is_loaded);
if (is_loaded) {
    printf("Configuration cache file loaded OK\n");
} else {
    printf("Configuration cache likely does not exist yet\n");
}
Memcached buckets are not supported with the configuration cache.
The core goal of the library is of course to allow you to perform data operations against the cluster. Data operations are submitted to one of the API calls which will then convert your inputs into packets which are scheduled over the network. For these simple data commands you may provide one or more request structures and schedule them. If the scheduling is successful then you are guaranteed to receive a number of callbacks exactly equal to the number of requests scheduled.
Crucial to working effectively with callbacks and data operations are the following:
- You will typically need to pass your own cookie pointer to the API call. The cookie is your only way to associate a response with the rest of your application code.
- If you have multiple keys to deal with, it is significantly more efficient to schedule them in bulk by allocating an array of commands and an array of pointers to those commands.
The following example shows how to store a key in the cluster, retrieve its value, append the reversed form of the value, and retrieve the value again:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libcouchbase/couchbase.h>

static void die(const char *msg, lcb_error_t err)
{
    fprintf(stderr, "Got error %s. Code=0x%x, %s\n",
            msg, err, lcb_strerror(NULL, err));
    exit(EXIT_FAILURE);
}

static void get_callback(
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_get_resp_t *resp
);

static const char *key = "My Key";
static const char *value = "My Value";

int main(void)
{
    lcb_t instance;
    lcb_error_t err;
    lcb_store_cmd_t s_cmd = { 0 }, *s_cmdlist = &s_cmd;
    lcb_get_cmd_t g_cmd = { 0 }, *g_cmdlist = &g_cmd;
    char scratch[4096];

    err = lcb_create(&instance, NULL);
    if (err != LCB_SUCCESS) {
        die("Couldn't create instance", err);
    }
    lcb_set_get_callback(instance, get_callback);

    err = lcb_connect(instance);
    if (err != LCB_SUCCESS) {
        die("Couldn't schedule connection", err);
    }
    lcb_wait(instance);

    s_cmd.v.v0.key = key;
    s_cmd.v.v0.nkey = strlen(key);
    s_cmd.v.v0.bytes = value;
    s_cmd.v.v0.nbytes = strlen(value);
    s_cmd.v.v0.operation = LCB_SET;
    err = lcb_store(instance, NULL, 1,
                    (const lcb_store_cmd_t * const *)&s_cmdlist);
    if (err != LCB_SUCCESS) {
        die("Couldn't schedule store operation!", err);
    }
    lcb_wait(instance);

    g_cmd.v.v0.key = s_cmd.v.v0.key;
    g_cmd.v.v0.nkey = s_cmd.v.v0.nkey;
    err = lcb_get(instance, scratch, 1,
                  (const lcb_get_cmd_t * const *)&g_cmdlist);
    if (err != LCB_SUCCESS) {
        die("Couldn't schedule get operation!", err);
    }
    lcb_wait(instance);

    /**
     * Inside the get callback we've copied over the value data from the
     * callback into the scratch buffer which we passed as a cookie
     */
    {
        char reversed[4096];
        int ii;
        int curlen = strlen(scratch);
        curlen--;
        for (ii = curlen; ii >= 0; ii--) {
            reversed[curlen - ii] = scratch[ii];
        }
        s_cmd.v.v0.bytes = reversed;
        s_cmd.v.v0.operation = LCB_APPEND;
        /** Since it's reversed there's no need to re-set the size again */
        err = lcb_store(instance, NULL, 1,
                        (const lcb_store_cmd_t * const *)&s_cmdlist);
        if (err != LCB_SUCCESS) {
            die("Couldn't schedule second APPEND", err);
        }
        lcb_wait(instance);
    }

    /** Now get the key back again */
    err = lcb_get(instance, scratch, 1,
                  (const lcb_get_cmd_t * const *)&g_cmdlist);
    if (err != LCB_SUCCESS) {
        die("Could not schedule final get operation", err);
    }
    lcb_wait(instance);

    printf("Buffer in server is now %s\n", scratch);
    lcb_destroy(instance);
    return 0;
}

static void get_callback(
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_get_resp_t *resp)
{
    char *target = (char *)cookie;
    if (err != LCB_SUCCESS) {
        die("Got error inside get callback", err);
    }
    memcpy(target, resp->v.v0.bytes, resp->v.v0.nbytes);
    target[resp->v.v0.nbytes] = '\0';
}
Compile and run as:
$ cc -o example example.c -lcouchbase
$ ./example
Buffer in server is now My ValueeulaV yM
The above example program does several things in sequence. First it creates an instance. Note that the lcb_create_st structure is not passed; in lieu of this structure it is assumed you are connecting to the default bucket and the host is localhost.
After instantiation, the initial connection is performed via a call to lcb_connect(). To actually perform the connection, lcb_wait() is called.
The program will then try to store a key, retrieve its value, append the reversed form of the value to the existing value, and finally retrieve the value again.
Command structures are not owned by the library when passed to it; thus the memory pointed to by the command structure, the command list, and the key/value data need only remain valid for the duration of the scheduling API call (i.e. lcb_store(), lcb_get()). Once the scheduling call returns, any relevant data will have already been copied out of the buffers and they may be freed or reused for other operations.
Callbacks for operations return the data inside a response structure. The pointers received within the callback (except for the cookie itself) are valid only for the duration of the callback itself. If you wish to use the data outside the callback then you must copy it elsewhere, as demonstrated in the example where the cookie is a char[4096] which is the target for memcpy().
Operations may be scheduled in bulk. Bulk operations will optimize and decrease network usage by packing many commands into a single buffer. Typically bulked operations will not actually save raw transfer sizes but will substantially decrease TCP overhead.
Currently there are two ways to schedule operations in bulk. The first involves creating a list of commands and passing it to the library in a single call; the second involves calling the scheduling API once per command and finally calling lcb_wait() when all commands have been scheduled.
Scheduling with a command list:
#define NUM_COMMANDS 10
int ii;
lcb_store_cmd_t cmds[NUM_COMMANDS];
const lcb_store_cmd_t *cmdlist[NUM_COMMANDS];
lcb_error_t err;

memset(cmds, 0, sizeof(cmds));
for (ii = 0; ii < NUM_COMMANDS; ii++) {
    char *kbuf = malloc(128);
    char *vbuf = malloc(128);
    sprintf(kbuf, "Key_%d", ii);
    sprintf(vbuf, "Value_%d", ii);
    cmds[ii].v.v0.key = kbuf;
    cmds[ii].v.v0.nkey = strlen(kbuf);
    cmds[ii].v.v0.bytes = vbuf;
    cmds[ii].v.v0.nbytes = strlen(vbuf);
    cmds[ii].v.v0.operation = LCB_SET;
    cmdlist[ii] = &cmds[ii];
}
err = lcb_store(instance, NULL, NUM_COMMANDS, cmdlist);

/** The buffers may be freed as soon as scheduling returns */
for (ii = 0; ii < NUM_COMMANDS; ii++) {
    free((void *)cmds[ii].v.v0.key);
    free((void *)cmds[ii].v.v0.bytes);
}
lcb_wait(instance);
Note that two arrays are required here. One array is required to hold the command structures themselves, and the other array is required to hold the pointer list for the command structures.
Note that in this mode, if the scheduling function returns with an error then no callback will be invoked. If it returns LCB_SUCCESS then exactly ten callbacks will be invoked.
An alternative approach is scheduling within a loop like so:
#define NUM_COMMANDS 10
int ii;
lcb_error_t err;

for (ii = 0; ii < NUM_COMMANDS; ii++) {
    char kbuf[128];
    char vbuf[128];
    lcb_store_cmd_t cmd = { 0 }, *cmdp = &cmd;
    sprintf(kbuf, "Key_%d", ii);
    sprintf(vbuf, "Value_%d", ii);
    cmd.v.v0.key = kbuf;
    cmd.v.v0.nkey = strlen(kbuf);
    cmd.v.v0.bytes = vbuf;
    cmd.v.v0.nbytes = strlen(vbuf);
    cmd.v.v0.operation = LCB_SET;
    err = lcb_store(instance, NULL, 1, (const lcb_store_cmd_t * const *)&cmdp);
}
lcb_wait(instance);
Here the operations are implicitly scheduled in bulk, relying on the fact that no actual network I/O is performed before lcb_wait() is called.
Both methods have their own benefits and drawbacks. The first method simplifies error handling by allowing a single error code to dictate the success or failure of scheduling all the operations, while the second method must figure out what to do if one of the commands failed to schedule.
However, the first method requires many more memory allocations than the latter, as it requires all key/value data to remain valid until all of the commands can be passed to the library. Additionally it requires that a dedicated pointer array exist as well.
How you batch operations and how many operations you batch will depend on your use requirements and network speed. Typically TCP stacks will only be able to buffer so much data (usually several hundred kilobytes) before completely filling up the kernel's write buffer. Data batched beyond that will just slow down the library causing it to allocate and buffer more data than needed.
It is strongly recommended you use the batched operation mode if you have multiple items to send/receive
There are no limitations on the format of the key and value data (other than the key must not be empty). It is recommended that the key be a UTF-8 compatible sequence without any whitespace.
Currently key sizes are limited to 250 bytes and value sizes are limited to 20 MB. Consult the server manual for your server version for more exact information on these constraints.
You may make use of the flags field, which is provided as input in the lcb_store_cmd_t structure (when storing data) and as output in the lcb_get_resp_t structure (when it is fetched), to provide extra hints and metadata about the key-value pair.
Traditionally (and currently) the flags are used to indicate the format of the data. Currently there is no standard for the flags format, but several clients rely on client-specific flags values to determine the format of the data. Before you use this field, check to see if you are interoperating with any other client/SDK and ensure that your usage of the flags field will not conflict with it. Additionally, note that clients may extend their usage of flags as well.
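As a sketch, an application might define its own private format constants (the MYAPP_FMT_* values are illustrative, not any SDK's standard):

#define MYAPP_FMT_JSON 0x01 /** hypothetical application-defined flag value */
#define MYAPP_FMT_RAW  0x02

/** When storing: */
cmd.v.v0.flags = MYAPP_FMT_JSON;

/** When retrieving, inside the get callback: */
if (resp->v.v0.flags == MYAPP_FMT_JSON) {
    /** parse the value as JSON */
}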
The terms simple and complex here refer not to the data operation of the command but rather to the callback semantics. Most commands are simple; that is, you receive a single callback for a single request. Other commands are however more complex and invoke a callback multiple times per request. These commands are finished with a so-called NULL callback which indicates that no more callbacks will be invoked for the command.
Read the callback documentation carefully for each of the APIs to see how to handle the received callbacks.
In addition to the memcached protocol, this library also provides the ability to issue HTTP queries. HTTP queries may be issued to port 8092 and require a path.
Note that currently the library does not include utilities for constructing the view queries themselves. Specifically, the library provides these features:
- Selecting a proper node for a query
- Utilizing the asynchronous I/O layer as the rest of the library
- Receiving the response either as a series of callbacks for each received chunk, or all at once when the entire response has been buffered.
To create an HTTP request you will need to create an lcb_http_cmd_t. In addition to the command structure you will also be given an lcb_http_request_t which may be used to cancel the request later on.
lcb_http_cmd_t htcmd = { 0 };
lcb_http_request_t htreq;
lcb_error_t err;
const char *path = "beer-sample/_design/beer/_view/brewery_beers";

htcmd.v.v0.path = path;
htcmd.v.v0.npath = strlen(path);
htcmd.v.v0.method = LCB_HTTP_METHOD_GET;

err = lcb_make_http_request(
        instance, cookie, LCB_HTTP_TYPE_VIEW, &htcmd, &htreq);
if (err != LCB_SUCCESS) {
    die("Couldn't schedule HTTP request", err);
}
And the callbacks:
static void http_complete(
    lcb_http_request_t req,
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_http_resp_t *resp)
{
    const char * const *curhdr;
    if (err != LCB_SUCCESS) {
        die("Got fatal error for request", err);
    }
    if (resp->v.v0.status < 200 || resp->v.v0.status > 299) {
        fprintf(stderr, "Got non-success HTTP status: %d\n",
                resp->v.v0.status);
    }
    for (curhdr = resp->v.v0.headers; curhdr && *curhdr; curhdr += 2) {
        printf("Got HTTP Header %s: %s\n", curhdr[0], curhdr[1]);
    }
    if (resp->v.v0.nbytes) {
        fprintf(stderr, "Got %.*s for response body\n",
                (int)resp->v.v0.nbytes, (const char *)resp->v.v0.bytes);
    }
}
The http_complete callback is invoked when the HTTP response is complete. This happens either if an error has been detected or if the server has sent all the data for the response. You can set the callback via lcb_set_http_complete_callback (see <libcouchbase/callbacks.h> for the API).
Note that the callback should be ready to detect two error conditions. The first is an actual error in the response delivery, for example a network or timeout error. The second condition is an error status reply from the HTTP server itself.
Also provided with the response is a list of headers inside a char** array. The array is structured in the form of { "header", "value" } pairs, so iterate over the array incrementing the current element by two each time.
HTTP operations require error handling to ensure the scheduling of the request does not return LCB_EINVAL. This may happen, for example, if the path string is empty or contains illegal characters. Other error codes may be returned if there are no current nodes available for a view request.
Various HTTP codes may be returned if the target path does not exist or the URL and/or body is not recognized on the server.
Data may be uploaded to the server using the LCB_HTTP_METHOD_POST and LCB_HTTP_METHOD_PUT request methods. The appropriate method depends on the specific REST API being used (for example, creating a new design document requires a PUT whereas sending extended view query parameters should be done via a POST).
The server may also require that you provide a Content-Type header for your data. Typically this should be application/json.
By default the library will attempt to retry redirect responses (HTTP 3xx codes) and forward them to the new server. The maximum number of redirects the library will follow before returning an error may be specified via the LCB_CNTL_MAX_REDIRECTS setting via lcb_cntl(). Redirects will typically take place when a server is being ejected from the cluster, as a notification to the client that it should use a different node.
Instead of receiving the entire response in one gulp, you can use the streaming (i.e. chunked) mode inside the request, thus:
lcb_http_cmd_t cmd = { 0 };
/** ... */
cmd.v.v0.chunked = 1;
/** ... */
Note that chunked does not imply the HTTP-level Transfer-Encoding: chunked directive of the same name.
With this option you will receive chunks of data as they arrive inside a data_callback. The prototype of the data callback is the same as the completion callback, except that the bytes and nbytes fields will contain the current chunk rather than the entire response.
When the response is complete, the completion callback will be invoked and the bytes and nbytes fields will be empty.
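A sketch of such a data callback (its prototype mirrors the completion callback shown earlier; lcb_set_http_data_callback installs it):

static void http_data(
    lcb_http_request_t req,
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_http_resp_t *resp)
{
    if (resp->v.v0.nbytes) {
        /** Append this chunk to an application-side buffer; here just print it */
        fwrite(resp->v.v0.bytes, 1, resp->v.v0.nbytes, stdout);
    }
}

/** Installed alongside the completion callback: */
lcb_set_http_data_callback(instance, http_data);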
If you wish to use a different HTTP library you can still benefit from this library's automatic handling of cluster updates. You may select a node from the list of cluster nodes returned via the lcb_get_server_list() function.
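A sketch of walking that list (assuming it is returned as a NULL-terminated array of host:port strings):

const char * const *servers = lcb_get_server_list(instance);
for (; servers && *servers; servers++) {
    printf("Cluster node: %s\n", *servers);
}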
You may adjust the timeout for HTTP operations by modifying the LCB_CNTL_VIEW_TIMEOUT setting in lcb_cntl.
Utilizing the OBSERVE command (explained later), the client may determine the replication and persistence status of a given item across the cluster. Clients which wish to ensure their application does not proceed until a given item has been persisted/replicated to a specific number of nodes may use the lcb_durability_poll() API call to poll the servers until either a timeout is reached or the items have been properly persisted/replicated to their respective targets.
static void durability_callback(
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_durability_resp_t *resp)
{
    if (err != LCB_SUCCESS) {
        die("Response was not successful for durability", err);
    }
    printf("Exists on master? %d\n", resp->v.v0.exists_master);
    printf("Item persisted to %d nodes\n", resp->v.v0.npersisted);
    printf("Item replicated to %d nodes\n", resp->v.v0.nreplicated);
    printf("Needed %d responses to poll\n", resp->v.v0.nresponses);
    printf("Conformance to requirements with err=0x%x\n", resp->v.v0.err);
    printf("CAS of item (as exists on master): 0x%" PRIX64 "\n",
           resp->v.v0.cas);
}

static void storage_callback(
    lcb_t instance,
    const void *cookie,
    lcb_storage_t op,
    lcb_error_t err,
    const lcb_store_resp_t *resp)
{
    lcb_durability_opts_t options = { 0 };
    lcb_durability_cmd_t cmd = { 0 }, *cmdp = &cmd;
    if (err != LCB_SUCCESS) {
        die("Got error in store callback", err);
    }
    cmd.v.v0.key = resp->v.v0.key;
    cmd.v.v0.nkey = resp->v.v0.nkey;
    cmd.v.v0.cas = resp->v.v0.cas;
    options.v.v0.persist_to = -1;
    options.v.v0.replicate_to = -1;
    options.v.v0.cap_max = 1;
    lcb_durability_poll(instance, cookie, &options, 1,
                        (const lcb_durability_cmd_t * const *)&cmdp);
}
The API for the structures may be found in <libcouchbase/durability.h>.
The basic use model is to schedule a durability poll right after the item has been stored. The lcb_durability_opts_t structure defines the parameters for the persistence and replication requirements. You can define these to contain the minimum number of nodes you wish the item to be persisted and replicated to. Note that the maximum number of nodes you may persist to is the lesser of the number of nodes in the cluster and the number of replicas plus one node. The maximum number of nodes you may replicate to is the lesser of the number of replicas and the number of nodes in the cluster minus one (to exclude the master).
You can also set the cap_max field to true and then set the persist_to and replicate_to fields to a high number (as they are unsigned values, you can simply use -1), and the library will auto-adjust to the number of replicas which are available at the time of the request.
To actually specify the keys to poll for, use the lcb_durability_cmd_t structure, employing one such structure per key (as with the other commands).
You should also set the CAS of the structure (if possible) so that you can safeguard against concurrent modifications of the same item.
Note that the command is performed by polling multiple nodes repeatedly. Depending on application, network, and server load this may have negative consequences on performance. To adjust the polling preferences, set the interval field in the options structure. This specifies the time to wait between successive polls of all the nodes in the event that the first poll did not show the item persisted or replicated to the number of required nodes. A higher interval means less network load (but a potentially longer time to receive a response) whereas a lower interval means a quicker response time at the expense of more calls to the network.
Additionally you may batch multiple items into a single durability polling operation. At the expense of the added latency in waiting for all the responses from the first (store) operation to arrive, you can gain more efficiency at the polling end by batching as many keys as possible into a single OBSERVE request.
Since persisting an item across multiple servers may take longer than simply storing the item in the master node's cache, durability operations have their own timeout setting. You may modify the timeout inside the options structure. If a specific timeout is not provided, it will default to the value of the LCB_CNTL_DURABILITY_TIMEOUT lcb_cntl() setting.
In addition to the durability timeout, the default durability poll interval may also be adjusted using the LCB_CNTL_DURABILITY_INTERVAL setting.
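A sketch of adjusting both defaults (microsecond operands assumed, as with the other timeout settings):

lcb_uint32_t dur_tmo = 5000000;     /** give up polling after 5 seconds */
lcb_uint32_t dur_interval = 100000; /** poll every 100 milliseconds */
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_DURABILITY_TIMEOUT, &dur_tmo);
lcb_cntl(instance, LCB_CNTL_SET, LCB_CNTL_DURABILITY_INTERVAL, &dur_interval);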
The OBSERVE command may be used to retrieve information about a given item's replication and persistence status on a variety of nodes. The following shows a sample usage of lcb_observe():
static void observe_callback(
    lcb_t instance,
    const void *cookie,
    lcb_error_t err,
    const lcb_observe_resp_t *resp)
{
    if (err != LCB_SUCCESS) {
        die("Observe failed", err);
    }
    if (!resp->v.v0.nkey) {
        /**
         * We get an indeterminate number of callbacks. The final
         * callback will always contain 0 for nkey and/or NULL for key
         */
        printf("Got final null callback for %p.\n", cookie);
        return;
    }
    if (resp->v.v0.from_master) {
        printf("Got reply from master\n");
    } else {
        printf("Got reply from replica\n");
    }
    printf("Got raw status code 0x%x\n", resp->v.v0.status);
    printf("Got raw CAS 0x%" PRIX64 "\n", resp->v.v0.cas);

    /** Since LCB_OBSERVE_FOUND is 0x00 we can't do a bitwise comparison */
    if (resp->v.v0.status == LCB_OBSERVE_FOUND) {
        printf("Key exists in node's cache (but not persisted)\n");
    }
    if (resp->v.v0.status & LCB_OBSERVE_PERSISTED) {
        printf("Key is persisted on disk\n");
    }
    if (resp->v.v0.status & LCB_OBSERVE_NOT_FOUND) {
        printf("Key does not exist in cache\n");
        /**
         * An item can still be scheduled for deletion from disk, but actually
         * exist on the disk while being logically purged from cache
         */
        if (resp->v.v0.status & LCB_OBSERVE_PERSISTED) {
            printf("But key is still awaiting deletion from disk\n");
        }
    }
}

void schedule_observe(void)
{
    lcb_observe_cmd_t cmd = { 0 }, *cmdp = &cmd;
    lcb_error_t err;
    cmd.v.v0.key = "foo";
    cmd.v.v0.nkey = 3;
    err = lcb_observe(instance, NULL, 1,
                      (const lcb_observe_cmd_t * const *)&cmdp);
    if (err != LCB_SUCCESS) {
        die("Couldn't schedule observe request", err);
    }
    lcb_wait(instance);
}
The response callback for the lcb_observe() call deserves more detail. A callback (set via lcb_set_observe_callback()) will be invoked for each node in the cluster which is either a replica or a master for the key. Once all the callbacks have been received, the library will send a so-called 'null callback' which contains the cookie, but with the response fields all set to empty/NULL values.
The response structure itself will contain a from_master field indicating if the reply was received from a master or replica node.
The response will also contain the CAS of the current item as it exists on
the given node (if the item exists) and a status code containing various bits
set depending on whether the item has been persisted, replicated, both, or
neither.
As of version 2.3, the lcb_observe() call may also be used to determine if the item exists on the master without performing any specific kind of mutation or retrieval on the item. This is done by setting the version field of the command structure to 1 and setting the options field to LCB_OBSERVE_MASTER_ONLY, like so:
void schedule_observe(void)
{
    lcb_observe_cmd_t cmd, *cmdp = &cmd;
    lcb_error_t err;
    memset(&cmd, 0, sizeof(cmd));
    cmd.version = 1;
    cmd.v.v1.key = "foo";
    cmd.v.v1.nkey = 3;
    cmd.v.v1.options = LCB_OBSERVE_MASTER_ONLY;
    err = lcb_observe(instance, NULL, 1,
                      (const lcb_observe_cmd_t * const *)&cmdp);
    if (err != LCB_SUCCESS) {
        die("Couldn't schedule observe request", err);
    }
    lcb_wait(instance);
}
Note that you will still receive two callbacks; one which contains the reply from the master, and one which will signal the final null callback.
While Couchbase is a multi-master, multi-replica cluster, only a single master can exist for a single vBucket at a given time. If the master goes down, then master access to the keys it hosted becomes unavailable. While replicas are promoted to masters during a failover, for some time-sensitive data and use cases it may not be reasonable to wait until a node is manually or automatically failed over.
You may read data from a replica by using the lcb_get_replica() call. This functions like a normal lcb_get() except it reads data from the replicas. Note that the data from a replica is not considered to be authoritative, as it may be older than the data on the master. If the master comes back online, the replica data may potentially be overwritten by a newer copy.
To invoke lcb_get_replica you will need an lcb_get_replica_cmd_t. The command structure contains the normal key and nkey fields, but adds two additional fields:
- strategy - Indicates how many replicas will be contacted. The options are LCB_REPLICA_SELECT to contact a specific replica, LCB_REPLICA_FIRST to contact each replica in turn until a successful reply is received, and LCB_REPLICA_ALL to fetch a response from each replica and return all the responses to the user.
- index - Only valid with LCB_REPLICA_SELECT; this selects the replica to contact.
With LCB_REPLICA_ALL you will receive multiple get callbacks for the same key. You will not receive a final null callback for the item and must thus use a counter to determine how many times the callback was invoked. This count can be found by using the value returned from lcb_get_num_replicas(), which will return the number of replicas the bucket is configured with.
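A sketch using LCB_REPLICA_FIRST (assuming, per the fields above, that strategy lives in the v1 view of the command structure; consult your headers):

lcb_get_replica_cmd_t rcmd = { 0 }, *rcmdp = &rcmd;
lcb_error_t err;
rcmd.version = 1;
rcmd.v.v1.key = "foo";
rcmd.v.v1.nkey = 3;
rcmd.v.v1.strategy = LCB_REPLICA_FIRST; /** stop at the first successful reply */
err = lcb_get_replica(instance, NULL, 1,
                      (const lcb_get_replica_cmd_t * const *)&rcmdp);
if (err != LCB_SUCCESS) {
    die("Couldn't schedule replica read", err);
}
lcb_wait(instance); /** the regular get callback receives the result */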