Text for pmix logging server example
The “hello world” version of a PMIx host environment is a bit more involved than a first program in most languages. Here we present a simple PMIx host environment whose logging functionality is comparable to printing “Hello World”. Because PMIx targets distributed and parallel computing systems, a PMIx host environment is normally a distributed application running across a collection of resources. Such a collection of processes implements what the PMIx standard calls a PMIx universe. Before making the jump to a multi-process host environment, we present a simple single-process server. Several important concepts need to be understood before moving to a truly distributed host environment, and this simple example introduces them.
Another simplification in this example is to support only processes that are started as tools. The term tool has a specific meaning within PMIx: it refers to a process that is not instantiated by the PMIx server and that connects to the server using PMIx_tool_init. The PMIx design assumes that the host environment includes a mechanism to launch processes. In a typical system, processes are instantiated through an interface provided by the host environment. Sometimes this is done in coordination with a resource manager or scheduler, both of which are loosely considered part of the host environment. Once such clients are created, they can create additional processes with PMIx_Spawn(), but the primary way to create PMIx clients is through the scheduler/resource manager. Whether process creation is driven by a client call to PMIx_Spawn() or by an interface provided by the host environment, the new processes must be registered with PMIx using PMIx_server_register_nspace(). This call requires detailed setup of data structures which, though important to understand, makes a difficult starting point for learning the basic design of the host environment. Therefore, this example starts with support for tools, which are processes started externally to the host environment. Since these are single processes without peers or specific resources assigned to them, they can be supported by simply assigning each one a unique namespace.
To understand the expectations of this trivial host environment, begin by examining the tool code that this server intends to support. Figure xxx shows a simple PMIx tool that uses PMIx API calls to ask the host environment to log data. The first PMIx call is to PMIx_tool_init. This call is required before a tool makes other PMIx API calls, and it fills in the name PMIx has assigned to this tool. All PMIx clients (including tools) are assigned a unique identifier consisting of a namespace and a rank. The structure that holds this identifier is a pmix_proc_t, which contains a string namespace and an integer rank. Processes launched as peers are assigned the same namespace, with rank values of 0 through n-1, where n is the number of processes launched together. Since tools have no peers, this call should set the rank to 0 and provide a unique namespace that distinguishes this tool from other tools and clients.
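The naming scheme described above can be illustrated with a small sketch. It does not use the PMIx headers: proc_name is a stand-in for pmix_proc_t, and name_peers and name_tool are hypothetical helpers invented for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for pmix_proc_t (hypothetical; the real type comes from <pmix.h>).
struct proc_name {
    std::string nspace;   // shared by all processes launched together
    std::uint32_t rank;   // 0 .. n-1 within the namespace
};

// Name n peer processes that were launched together in one namespace.
std::vector<proc_name> name_peers(const std::string &nspace, std::uint32_t n) {
    std::vector<proc_name> peers;
    for (std::uint32_t r = 0; r < n; ++r) {
        peers.push_back(proc_name{nspace, r});
    }
    return peers;
}

// A tool has no peers: it gets a namespace of its own and rank 0.
proc_name name_tool(unsigned id) {
    return proc_name{"tool_" + std::to_string(id), 0};
}
```

Under these assumptions, name_peers("job_42", 3) yields ranks 0, 1 and 2 in the namespace "job_42", while name_tool(7) yields the single name ("tool_7", 0).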
The PMIx call being invoked by this example is PMIx_Log. This call requests that the host environment log a string. The data parameter is an array of logging requests. For example, one could request to log something to stdout and log something else (or the same string) to the system log. This example requests to log a single string to stderr. The data is provided as an array of pmix_info_t, and the PMIx standard explains the various ways data can be presented. As with most PMIx calls, an array of directives is also passed to the call with information that can influence the operation. To demonstrate this, the example calls PMIx_Log twice. The first call passes no directives, and the second call includes a directive requesting that a timestamp be recorded with the log entry.
Finally, a call to PMIx_tool_finalize deregisters the tool. The host environment should avoid re-using the same namespace name again to avoid confusion between terminated and executing processes.
Now that we understand what we want the host environment to be capable of, consider the basic structure of the host environment. The code shown in Figure xxx has a very simple main() function that calls PMIx_server_init() and then looks for work to do. When a task returns an indication that it is time for the server to terminate, it will call PMIx_server_finalize() and exit.
The call to PMIx_server_init() will register the services that the host environment is capable of handling. The pmix_server_module_t describes the possible entry points that the host environment is making available. In this case, server_tool_connection() is the function that the PMIx server code calls when a tool has called PMIx_tool_init(). The function server_log() will be called when a client/tool calls PMIx_Log().
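The idea of a table of optional entry points can be sketched without the PMIx headers. In the sketch below, server_module is a three-entry stand-in for pmix_server_module_t, and the field names, handler names, and status values are all chosen for this illustration. Treating a missing entry as "not supported" is an assumption about how a caller might behave, not a statement about the PMIx library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical status values for this sketch only.
enum { STATUS_OK = 0, STATUS_NOT_SUPPORTED = -2 };

// Stand-in for pmix_server_module_t, reduced to three entry points.
struct server_module {
    int (*tool_connected)(void);
    int (*log)(const char *text);
    int (*spawn)(void);
};

int on_tool_connected(void) { return STATUS_OK; }
int on_log(const char *text) { return text != NULL ? STATUS_OK : -1; }

// Value-initialization nulls every entry; only supported ones are filled in.
server_module make_module() {
    server_module m = server_module();
    m.tool_connected = on_tool_connected;
    m.log = on_log;
    return m;
}

// How a caller might treat an entry point the host did not provide.
int request_spawn(const server_module &m) {
    return m.spawn != NULL ? m.spawn() : STATUS_NOT_SUPPORTED;
}
```

Assigning entries by name after zero-initializing the structure avoids having to count positional NULL entries, which is easy to get wrong when the structure gains fields between releases.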
The implementation of server_tool_connection is rather simple in our minimalistic server. All that is required of it is to establish a unique namespace name and return it by calling the callback function passed to it (a pmix_tool_connection_cbfunc_t), supplying the pmix_proc_t assigned to the tool and the opaque callback data provided. Unfortunately, server_tool_connection cannot directly implement these few lines of code: the design of the server code requires that the callback function be called from a different thread. This may seem unnecessarily burdensome, but most callback functions will require more significant work from the host environment. A distributed server will often need to send messages off host and wait for responses before the operation is complete. Therefore, even if calling back directly were permitted, building a simple queuing system now leaves us better positioned to turn our simple server into a fully functional, multi-process, distributed server, which will need to queue requests anyway.
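A minimal sketch of such a queuing system follows. It is not the code used in this example: deferred_queue is a hypothetical helper that stores closures instead of tagged work units, and it relies only on the C++ standard library.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical helper: upcalls push deferred work, the host's thread drains it.
class deferred_queue {
public:
    // Called from the library's thread inside an upcall; returns quickly.
    void defer(std::function<void()> task) {
        std::lock_guard<std::mutex> guard(lock_);
        tasks_.push(std::move(task));
    }

    // Called from the host's own thread. Runs at most one task, outside the
    // lock, and reports whether a task was run.
    bool run_one() {
        std::function<void()> task;
        {
            std::lock_guard<std::mutex> guard(lock_);
            if (tasks_.empty()) return false;
            task = std::move(tasks_.front());
            tasks_.pop();
        }
        task();
        return true;
    }

private:
    std::mutex lock_;
    std::queue<std::function<void()>> tasks_;
};
```

The task is run after the lock is released so that a callback that triggers another upcall cannot block on the queue lock.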
The function server_tool_connection() will therefore create a task related to this request and the main work thread will be responsible for processing the task. The work that needs to be done is:
1) Assign a unique namespace name for this request
2) Call the callback function provided and return the chosen name
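Step 1 can be sketched on its own, under a few assumptions: kMaxNsLen stands in for PMIX_MAX_NSLEN (its value here is illustrative), next_tool_namespace is a hypothetical helper, and the function is only ever called from one thread at a time.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Stand-in for PMIX_MAX_NSLEN; the real constant comes from the PMIx headers.
static const int kMaxNsLen = 255;

// Write the next unused "tool_<n>" name into out (kMaxNsLen + 1 bytes).
// The counter only grows, so a name is never handed out twice in one run.
// Not thread-safe: assumes a single caller thread.
void next_tool_namespace(char out[kMaxNsLen + 1]) {
    static unsigned next_tool = 0;
    std::snprintf(out, kMaxNsLen + 1, "tool_%u", next_tool++);
}
```

Because the counter is never reset or decremented, a namespace that belonged to a terminated tool is not reused while the server keeps running.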
The example code assigns the unique namespace name in server_tool_connection and then creates and fills in a small structure that gives the main progress thread enough information to complete the request by calling the callback function that the PMIx library provided to server_tool_connection. PMIx makes frequent use of callback functions, and it is common to be given a callback function by the PMIx library along with opaque callback data that is to be passed to the callback function when it is invoked. The caller does not need to understand or interpret this callback data, but the callback function will use it to recover the context in which the callback was originally requested. To call the callback, the main thread must know which callback function needs to be invoked, the callback data that needs to be passed to it, and any results that need to be provided. Since the main thread also needs to know the signature of the callback function, the type of the callback function is needed as well. The server_tool_connection function creates a structure (work_unit_t) to store this information. In the case of server_tool_connection, it is filled in with the callback function that needs to be called, the opaque callback data, the TOOL_CONNECTION type, and a pmix_proc_t (named newproc in the code) that contains the namespace and rank assigned by the server to the tool. The namespace in this example is simply the string "tool_" with a unique integer appended to it.
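The opaque callback data pattern can be shown in isolation. The sketch below plays both roles: pending_request and complete are the host side, while library_context and library_on_connect imitate what a library might hide behind cbdata. All of these names are hypothetical, and connect_cbfunc is only loosely modeled on pmix_tool_connection_cbfunc_t.

```cpp
#include <cassert>
#include <string>

// Stand-in callback signature for this sketch.
typedef void (*connect_cbfunc)(int status, const char *nspace, void *cbdata);

// What the host saves while the request is pending.
struct pending_request {
    connect_cbfunc cbfunc;  // which function to call when done
    void *cbdata;           // opaque: stored and handed back, never dereferenced
    int status;             // result to report
    std::string nspace;     // result data to report
};

// Host side: complete the request later, from the host's own thread.
void complete(const pending_request &req) {
    req.cbfunc(req.status, req.nspace.c_str(), req.cbdata);
}

// "Library" side, for illustration: the context hidden behind cbdata.
struct library_context {
    bool done;
    std::string assigned;
};

void library_on_connect(int status, const char *nspace, void *cbdata) {
    library_context *ctx = static_cast<library_context *>(cbdata);
    ctx->done = (status == 0);
    ctx->assigned = nspace;
}
```

Only library_on_connect casts cbdata back to its real type; the host code compiles without knowing that library_context exists.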
The work_unit_t structure is enqueued in the work queue, from which the main progress thread passes it to the server_process_work function. This function determines the type of callback function to which this work unit is related. In the case of TOOL_CONNECTION, the callback function that needs to be invoked is treated as a pmix_tool_connection_cbfunc_t (the tool member of the cb_func union) because we know that was the original type of the callback function passed into server_tool_connection. The pmix_tool_connection_cbfunc_t takes the following parameters: PMIX_SUCCESS or an error code indicating why the request was not satisfied, the assigned pmix_proc_t, and the opaque callback data.
Now that we have walked through the basic strategy for implementing a callback using the simple server_tool_connection, consider the more interesting server_log function. This function follows the same high-level steps: perform the requested operation and record the necessary information in a work_unit_t structure for handling by the main progress thread. In this case, however, handling the operation is more complex, since the function must look through the data to see what to log and through the directives to see how to log it.
The first 'for' loop in the code looks at each directive and records information about the supported options. Unsupported options are ignored unless they are marked as required, which can be tested using the PMIX_INFO_IS_REQUIRED macro. Specifically, this first loop checks whether the user requested that the server generate a timestamp, whether the caller provided a timestamp value, and whether the source of the log request is provided. There are other options that servers implementing PMIx_Log functionality are required to support, such as PMIX_LOG_SYSLOG_PRI and PMIX_LOG_ONCE; extending this example to implement them follows the same pattern.
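The rule that only required, unsupported directives cause a failure can be sketched as follows. The types and the key strings ("example.timestamp", "example.source") are made up for this illustration; real code would use pmix_info_t, the PMIX_LOG_* keys, and PMIX_INFO_IS_REQUIRED from the PMIx headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for illustration only.
struct directive {
    std::string key;
    bool required;   // plays the role of the "required" flag on a pmix_info_t
};

struct log_options {
    bool generate_timestamp;
    bool has_source;
};

enum scan_status { SCAN_OK = 0, SCAN_NOT_SUPPORTED = -1 };

// Record supported directives; fail only on an unsupported *required* one.
scan_status scan_directives(const std::vector<directive> &dirs, log_options &opts) {
    scan_status status = SCAN_OK;
    opts.generate_timestamp = false;
    opts.has_source = false;
    for (std::size_t i = 0; i < dirs.size(); ++i) {
        if (dirs[i].key == "example.timestamp") {
            opts.generate_timestamp = true;
        } else if (dirs[i].key == "example.source") {
            opts.has_source = true;
        } else if (dirs[i].required) {
            status = SCAN_NOT_SUPPORTED;   // caller insisted on something we lack
        }
        // Unsupported optional directives are skipped.
    }
    return status;
}
```

With this shape, an unknown optional directive leaves the result at SCAN_OK, while the same key marked as required turns the whole request into SCAN_NOT_SUPPORTED.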
The second 'for' loop looks at the data info provided. This simple server only implements support for logging to stdout and stderr, so if any other logging options are found it returns not supported. The PMIx standard requires that implementors of PMIx_Log also support PMIX_LOG_SYSLOG, PMIX_LOG_LOCAL_SYSLOG, and PMIX_LOG_GLOBAL_SYSLOG but we leave these as exercises for the reader.
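The validate-then-write approach can be sketched with standard streams in place of the PMIx types. Here log_entry, its channel strings, and write_entries are hypothetical names; real code would compare pmix_info_t keys against PMIX_LOG_STDOUT and PMIX_LOG_STDERR.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for one element of the data array.
struct log_entry {
    std::string channel;   // "stdout" or "stderr" in this sketch
    std::string text;
};

// Returns false and writes nothing if any entry targets an unsupported
// channel; otherwise writes each entry, prefixed, to its stream.
bool write_entries(const std::vector<log_entry> &entries, const std::string &prefix,
                   std::ostream &out, std::ostream &err) {
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (entries[i].channel != "stdout" && entries[i].channel != "stderr") {
            return false;
        }
    }
    for (std::size_t i = 0; i < entries.size(); ++i) {
        std::ostream &dst = (entries[i].channel == "stdout") ? out : err;
        dst << prefix << entries[i].text << '\n';
    }
    return true;
}
```

Checking every entry before writing any of them means a request is either logged in full or rejected in full, which keeps the reported status consistent with what actually appeared in the output.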
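#include <stdio.h>
#include <stdlib.h>
#include <string.h>    // strcmp, strlen
#include <time.h>      // ctime_r
#include <sys/time.h>  // gettimeofday
#include <assert.h>
#include <pmix.h>
#include <pmix_server.h>
#include <pthread.h>
#include <queue>

// Kind of callback a work unit must invoke; selects the union member below.
typedef enum {
    TOOL_CONNECTION,
    OP
} callback_t;

// One deferred callback: which function to call and with what arguments.
typedef struct work_unit_s {
    void *cbdata;              // opaque data handed to us by the PMIx library
    union {
        pmix_tool_connection_cbfunc_t tool;
        pmix_op_cbfunc_t op;
    } cb_func;
    callback_t cb_type;
    pmix_status_t cb_return;   // status to report through the callback
    void *cb_return_data;      // extra result (a pmix_proc_t* for TOOL_CONNECTION)
} work_unit_t;

// Queue of pending callbacks, shared between PMIx upcalls and main().
pthread_mutex_t callback_queue_lock = PTHREAD_MUTEX_INITIALIZER;
std::queue<work_unit_t*> callback_queue;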

// Runs on the main progress thread: invoke the callback recorded in the work
// unit, then release it. The return value tells main() whether to exit; this
// example never requests termination, so it always returns 0.
int
server_process_work(work_unit_t *work_caddy) {
    switch (work_caddy->cb_type) {
    case TOOL_CONNECTION:
        // cb_return_data holds the pmix_proc_t allocated in server_tool_connection.
        work_caddy->cb_func.tool(work_caddy->cb_return,
                                 (pmix_proc_t*) work_caddy->cb_return_data,
                                 work_caddy->cbdata);
        free(work_caddy->cb_return_data);
        break;
    case OP:
        work_caddy->cb_func.op(work_caddy->cb_return, work_caddy->cbdata);
        break;
    default:
        fprintf(stderr, "Server ERROR: Only Tool and Op callbacks supported at this time\n");
        break;
    }
    free(work_caddy);
    return 0;
}

// Upcall from the PMIx library when a tool calls PMIx_tool_init().
// Chooses a name for the tool and queues the callback for the main thread.
void
server_tool_connection(pmix_info_t *info, size_t ninfo, pmix_tool_connection_cbfunc_t cb_func, void* cbdata) {
    static int next_tool = 0;

    // Each tool gets its own namespace ("tool_<n>") and rank 0.
    pmix_proc_t *newproc = (pmix_proc_t*) malloc(sizeof(pmix_proc_t));
    newproc->rank = 0;
    snprintf(newproc->nspace, PMIX_MAX_NSLEN, "tool_%d", next_tool++);

    work_unit_t *work_caddy = (work_unit_t*) malloc(sizeof(work_unit_t));
    work_caddy->cbdata = cbdata;
    work_caddy->cb_func.tool = cb_func;
    work_caddy->cb_type = TOOL_CONNECTION;
    work_caddy->cb_return = PMIX_SUCCESS;
    work_caddy->cb_return_data = (void*) newproc;

    pthread_mutex_lock(&callback_queue_lock);
    callback_queue.push(work_caddy);
    pthread_mutex_unlock(&callback_queue_lock);
}

// Upcall from the PMIx library when a client or tool calls PMIx_Log().
void server_log(const pmix_proc_t *client, const pmix_info_t data[], size_t ndata,
                const pmix_info_t directives[], size_t ndirs,
                pmix_op_cbfunc_t cb_func, void *cbdata) {
    pmix_status_t result = PMIX_SUCCESS;
    char timestamp[1024] = { 0 };
    // Label entries with the requesting process unless PMIX_LOG_SOURCE overrides it.
    const char *source = client->nspace;

    // Pass 1: record the directives we support; reject unsupported required ones.
    for (size_t i = 0; i < ndirs; i++) {
        if (!strcmp(directives[i].key, PMIX_LOG_GENERATE_TIMESTAMP)) {
            struct timeval time_store;
            gettimeofday(&time_store, NULL);
            ctime_r(&time_store.tv_sec, timestamp);
        } else if (!strcmp(directives[i].key, PMIX_LOG_SOURCE)) {
            source = directives[i].value.data.proc->nspace;
        } else if (!strcmp(directives[i].key, PMIX_LOG_TIMESTAMP)) {
            if (strlen(timestamp) == 0) {
                ctime_r(&directives[i].value.data.time, timestamp);
            }
        } else {
            fprintf(stderr, "Server WARNING: Dir Key of %s not supported\n", directives[i].key);
            if (PMIX_INFO_IS_REQUIRED(&directives[i])) result = PMIX_ERR_NOT_SUPPORTED;
        }
    }

    // Pass 2: make sure every log request targets stdout or stderr.
    for (size_t i = 0; i < ndata; i++) {
        if (strcmp(data[i].key, PMIX_LOG_STDOUT) &&
            strcmp(data[i].key, PMIX_LOG_STDERR)) {
            fprintf(stderr, "Server WARNING: data Key of %s not supported\n", data[i].key);
            result = PMIX_ERR_NOT_SUPPORTED;
        }
    }

    // Pass 3: write the entries only if the whole request is supported.
    if (result == PMIX_SUCCESS) {
        if (strlen(timestamp)) timestamp[strlen(timestamp)-1] = ':'; // replace trailing \n
        for (size_t i = 0; i < ndata; i++) {
            if (!strcmp(data[i].key, PMIX_LOG_STDOUT)) {
                fprintf(stdout, "%s%s: %s\n", timestamp, source, data[i].value.data.string);
            } else if (!strcmp(data[i].key, PMIX_LOG_STDERR)) {
                fprintf(stderr, "%s%s: %s\n", timestamp, source, data[i].value.data.string);
            }
        }
    }

    // Queue the completion callback, reporting the actual outcome.
    work_unit_t *work_caddy = (work_unit_t*) malloc(sizeof(work_unit_t));
    work_caddy->cbdata = cbdata;
    work_caddy->cb_func.op = cb_func;
    work_caddy->cb_type = OP;
    work_caddy->cb_return = result;
    work_caddy->cb_return_data = NULL;

    pthread_mutex_lock(&callback_queue_lock);
    callback_queue.push(work_caddy);
    pthread_mutex_unlock(&callback_queue_lock);
}

// Entry points offered to the PMIx library; NULL means "not provided".
pmix_server_module_t mymodule = {
    NULL, // in 4.x use server_client_connected2
    NULL, // server_finalized
    NULL, // server_abort
    NULL, // server_fencenb
    NULL, // server_dmodex_req
    NULL, // server_publish
    NULL, // server_lookup
    NULL, // server_unpublish
    NULL, // server_spawn
    NULL, // server_connect
    NULL, // server_disconnect
    NULL, // server_register_events
    NULL, // server_deregister_events
    NULL, // listener
    NULL, // server_notify_event
    NULL, // server_query
    server_tool_connection,
    server_log,
    NULL, // allocate
    NULL, // job_control
    NULL, // monitor
    NULL, // get_credential
    NULL, // validate_credential
    NULL, // iof_pull
    NULL, // push_stdin
    NULL, // fabric
    NULL  // server_client_connected2
};

int main(int argc, char *argv[]) {
    pmix_info_t server_info[1];
    bool tool_support = true;
    bool time_to_exit = false;

    // Ask the PMIx server library to accept tool connections.
    PMIX_INFO_LOAD(&server_info[0], PMIX_SERVER_TOOL_SUPPORT, &tool_support, PMIX_BOOL);
    if (PMIX_SUCCESS != PMIx_server_init(&mymodule, server_info, 1)) {
        fprintf(stderr, "Server ERROR: PMIx_server_init failed\n");
        exit(1);
    }

    // Main progress loop: drain queued callbacks until a task asks us to exit.
    // The queue is only touched while holding the lock; the callback itself
    // runs after the lock is released.
    do {
        work_unit_t *work = NULL;
        pthread_mutex_lock(&callback_queue_lock);
        if (!callback_queue.empty()) {
            work = callback_queue.front();
            callback_queue.pop();
        }
        pthread_mutex_unlock(&callback_queue_lock);
        if (work != NULL) {
            time_to_exit = server_process_work(work);
        }
    } while (!time_to_exit);

    if (PMIX_SUCCESS != PMIx_server_finalize()) {
        fprintf(stderr, "Server ERROR: PMIx_server_finalize failed\n");
        exit(1);
    }
    return 0;
}
#include <stdio.h>
#include <stdlib.h>   // exit
#include <pmix_tool.h>

int main(int argc, char **argv)
{
    pmix_proc_t myproc = {"UNDEF", PMIX_RANK_UNDEF};
    pmix_status_t rc;
    pmix_info_t data[1];
    pmix_info_t directives[1];

    // Connect to the server; on success myproc holds the assigned name.
    if (PMIX_SUCCESS != (rc = PMIx_tool_init(&myproc, NULL, 0))) {
        fprintf(stderr, "PMIx_tool_init failed: %d\n", rc);
        exit(rc);
    }
    fprintf(stderr, "Tool assigned namespace %s rank %d\n",
            myproc.nspace, myproc.rank);

    // The data array must be filled in before the first PMIx_Log call.
    PMIX_INFO_LOAD(&data[0], PMIX_LOG_STDERR, "Hello World", PMIX_STRING);

    // First call: log to stderr with no directives.
    if (PMIX_SUCCESS != (rc = PMIx_Log(data, 1, NULL, 0))) {
        fprintf(stderr, "PMIx_log failed: %d\n", rc);
    }

    // Second call: same data, plus a directive asking for a timestamp.
    PMIX_INFO_LOAD(&directives[0], PMIX_LOG_GENERATE_TIMESTAMP, NULL, PMIX_BOOL);
    if (PMIX_SUCCESS != (rc = PMIx_Log(data, 1, directives, 1))) {
        fprintf(stderr, "PMIx_log with timestamp failed: %d\n", rc);
    }

    PMIx_tool_finalize();
    return 0;
}