andykrohg/Containerization.md

## Containerization.md

      
Red Hat Application Migration Toolkit - Cloud Readiness Target/Containerization Transformation Path

Documentation distilled from the core ruleset here. Further information on RHAMT can be found on the product page here.
Cloud Mandatory

Mandatory items to be reviewed for a successful migration to a cloud environment.
Embedded Cache Libraries

This ruleset detects embedded cache libraries that may cause issues during migration.
This is a cloud readiness issue because embedded caches may hold state information that is not persisted to a backing service.
Java Remote Method Invocation (RMI) service

This is a ruleset for Java Remote Method Invocation specific rules for migrating to OpenShift.
The use of Java RMI denotes a tight coupling that is better avoided in a cloud environment. Java EE standard and loosely coupled protocols are recommended for interactions with backing services.
Some examples are:

message-based communication (JMS) for asynchronous use cases
HTTP-based protocol or API (JAX-RS and JAX-WS) for synchronous use cases

In combination with load balancing, both options ensure scalability and high availability.
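
As a rough illustration of the synchronous HTTP option, the sketch below replaces a remote method call with a plain HTTP request. It uses the JDK's built-in `HttpServer` and `HttpClient` (Java 11+) rather than JAX-RS, purely so that it runs without an application server. The class name, the `/greeting` path, and the response body are assumptions made for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class HttpInsteadOfRmi {

    /** Starts a tiny HTTP endpoint on an ephemeral loopback port (hypothetical /greeting resource). */
    static HttpServer startServer() throws IOException {
        InetSocketAddress address = new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0);
        HttpServer server = HttpServer.create(address, 0);
        server.createContext("/greeting", exchange -> {
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Calls the endpoint by URL only; the caller shares no classes or stubs with the server. */
    static String callGreeting(int port) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/greeting"))
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            System.out.println(callGreeting(server.getAddress().getPort()));
        } finally {
            server.stop(0);
        }
    }
}
```

Because the client depends only on a URL and a payload format, the service behind it can be scaled or relocated behind a load balancer without changes to the caller.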
Java native libraries (JNI, JNA)

This ruleset detects Java Native Interface (JNI) and Java Native Access (JNA) code usage while migrating to a cloud environment.
The Java native libraries (JNI, JNA) enable Java code to call and be called by operating system native applications and libraries written in other programming languages (e.g. C, C++).

Review the purpose and check the compatibility of this native code usage. If the native code cannot be run in the cloud/container environment, a migration strategy in this regard should be defined.
The following options are relevant in this regard:

reuse and embed the native library/application to the cloud environment (e.g. in a JBoss module)
contact the vendor/provider of the native library/application
replace/remove/rewrite the used native library/application by a cloud-compatible equivalent
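
One simple way to check compatibility is to probe whether the library can be loaded at all inside the target container image. The following is a minimal sketch; the class name and the default library name `examplenative` are hypothetical, and a successful load says nothing about whether the library behaves correctly.

```java
public class NativeLibraryProbe {

    /** Returns true if the JVM can load the named native library in this environment. */
    static boolean canLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            System.err.println("Native library '" + libraryName + "' is not loadable: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // "examplenative" is a placeholder; pass the real library name as the first argument.
        String name = args.length > 0 ? args[0] : "examplenative";
        System.out.println(name + " loadable: " + canLoad(name));
    }
}
```

Running such a probe in the container image early in the migration shows whether the library and its operating system dependencies are present before the application relies on them.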

Also evaluate whether to move native libraries to a JBoss EAP module.

How to load native libraries and access them via JNI in EAP
Is it supported to compile the JNI code as 32-bit shared libraries and use it in 64-bit compiled Java code?

Local Storage

This is a ruleset for local storage related suggestions for migrating to cloud environments.
Accessing a file on local storage in a cloud environment is not safe. Inside a running container, an application can never assume that anything stored on disk will remain available, because a restart (triggered by a code deploy, a config change, or the execution environment relocating the process to a different physical location) will usually wipe out all local state (e.g., memory and filesystem).
There are different ways to improve the application based on what the file is used for:

logging: log to stdout [1] and use a centralized log collector to analyze logs [2]
caching: use a cache backing service [3] accessed via a URL or other locator/credentials stored in the config (see below)
config: store configurations in environment variables because they are easy to change between deploys without changing any code [4][5][6]
storing data: use a database backing service [3] in case of relational data or a persistent storage system [7][8][9][10]
temporary data: the file system of a running container should be used to store files only as a brief, single-transaction cache (e.g. downloading a large file, operating on it, and storing the results of the operation in the database)
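
The temporary-data case can be sketched as follows: the file exists only for the duration of one operation and is deleted in a `finally` block. The class and method names are assumptions for illustration, and counting lines stands in for whatever processing the application actually performs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class TransientFileUse {

    /**
     * Writes the payload to a temporary file, derives a result from it, and always
     * deletes the file afterwards. The caller is expected to persist the returned
     * result in a backing service, not on the local disk.
     */
    static long countLines(byte[] payload) throws IOException {
        Path tmp = Files.createTempFile("download-", ".tmp");
        try {
            Files.write(tmp, payload);
            try (Stream<String> lines = Files.lines(tmp, StandardCharsets.UTF_8)) {
                return lines.count();
            }
        } finally {
            Files.deleteIfExists(tmp); // nothing is left behind on the container file system
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "a\nb\nc\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(countLines(payload));
    }
}
```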

References:

1. Twelve-factor app - Logs
2. OpenShift - Aggregating container logs
3. Twelve-factor app - Backing services
4. Twelve-factor app - Config
5. OpenShift - Managing Environment Variables
6. OpenShift - ConfigMaps
7. OpenShift - Persistent storage (Concepts)
8. OpenShift - Configure persistent storage
9. OpenShift - Blog post about persistent storage
10. OpenShift - Object Storage

Logging

This is a ruleset for logging-related topics when migrating an application to cloud environments.
Problems:

Logging to file system - Logging to individual files should be avoided in a cloud environment, as locally written log files may be lost on instance termination or restart.
Logging to Socket Handler - Socket communication is not suitable in a cloud environment, because such an environment does not provide fixed target hosts to communicate with.

Consider instead the following options:

usage of a centralized log management system
log to standard output (console) and let the cloud platform handle the output
usage of shared storage for log files
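
For the standard output option, a minimal sketch with `java.util.logging` might look like the following. The class name and logger name are assumptions. Note that the JDK's default `ConsoleHandler` writes to standard error, so this sketch attaches a `StreamHandler` bound to `System.out` instead.

```java
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

public class StdoutLogging {

    /** Returns a logger that writes only to standard output, with no file or socket handlers. */
    static Logger stdoutLogger(String name) {
        Logger logger = Logger.getLogger(name);
        if (logger.getHandlers().length > 0) {
            return logger; // already configured; avoid attaching a second handler
        }
        logger.setUseParentHandlers(false); // skip the root console handler, which writes to stderr
        StreamHandler handler = new StreamHandler(System.out, new SimpleFormatter()) {
            @Override
            public synchronized void publish(LogRecord record) {
                super.publish(record);
                flush(); // StreamHandler buffers output, so flush after every record
            }
        };
        handler.setLevel(Level.ALL);
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) {
        stdoutLogger("app").info("application started");
    }
}
```

The platform's log collector can then pick up the container's output stream, and the application needs no knowledge of where the logs end up.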


Twelve-factor app - Logs
Aggregating container logs

HTTP session replication (distributable web.xml)

This is a simple ruleset for detecting usage of data storage in HTTP session objects when migrating an application to a cloud environment.
Session replication ensures that client sessions are not disrupted by failovers of nodes in a cluster. Each node in the cluster shares information about ongoing sessions and can take over sessions if a node disappears. The <distributable/> tag inside the <web-app> tag of the application's web.xml descriptor file enables clustering of the application's sessions [1].
In a cloud environment it has to be considered that the data in the memory of a running container can be wiped out by a restart (triggered by code deploy, config change, or the execution environment relocating the process to a different physical location)[2].
Different approaches can be followed:

review session replication's usage and make sure it is configured properly (effort 3) [3]
rearchitect the application and consider storing sessions in a cache backing service (effort 7) [4][5][6][7][8]
disable HTTP session clustering and accept its implications

The second approach (using a cache backing service) has some benefits:

Increased application scalability and elasticity. By offloading the session data to a remote data grid, the application tier itself can be more scalable and elastic.
Session Persistence. By offloading session data to a remote data grid, the application itself will be able to survive EAP node failures, since a JVM failure will not cause the session data to be lost.
Session Data Sharing. If you have a requirement for multiple applications to be able to share session data, this solution might be able to solve that use case as well.
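
The idea behind these benefits can be sketched in plain Java. Everything in the example is hypothetical: `SessionStore` is an invented abstraction, and `InMemoryStore` is only a stand-in for a remote data grid so that the sketch runs on its own. It is not the JBoss Data Grid API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ExternalizedSessionSketch {

    /** Hypothetical abstraction over a cache backing service that holds session data. */
    interface SessionStore {
        void put(String sessionId, String key, String value);
        String get(String sessionId, String key);
    }

    /** In-memory stand-in for the remote store; a real deployment would call a data grid. */
    static class InMemoryStore implements SessionStore {
        private final Map<String, Map<String, String>> data = new ConcurrentHashMap<>();

        @Override
        public void put(String sessionId, String key, String value) {
            data.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
        }

        @Override
        public String get(String sessionId, String key) {
            Map<String, String> session = data.get(sessionId);
            return session == null ? null : session.get(key);
        }
    }

    /** An application node that keeps no session state of its own. */
    static class AppNode {
        private final SessionStore store;

        AppNode(SessionStore store) {
            this.store = store;
        }

        void addToCart(String sessionId, String item) {
            store.put(sessionId, "cart", item);
        }

        String cart(String sessionId) {
            return store.get(sessionId, "cart");
        }
    }

    public static void main(String[] args) {
        SessionStore store = new InMemoryStore();
        AppNode nodeA = new AppNode(store);
        nodeA.addToCart("s1", "book");
        nodeA = null; // simulate the first node going away

        AppNode nodeB = new AppNode(store); // a replacement node picks up the same session
        System.out.println(nodeB.cart("s1"));
    }
}
```

Because the nodes hold no session state, any of them can be restarted or added without affecting ongoing sessions.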

References:

1. JBoss EAP - Clustering in Web Applications
2. Twelve-factor app - Processes
3. OpenShift - JBoss EAP Clustering
4. Twelve-factor app - Backing services
5. Red Hat JBoss Data Grid for OpenShift
6. JBoss EAP - Externalize HTTP Sessions to JBoss Data Grid
7. Developer blog post - Externalize HTTP Session Data to the JBoss Data Grid
8. Developer blog post - Externalized HTTP Session in an OpenShift 3.9 Environment

Cloud Optional

Optional recommendations for a successful migration to a cloud environment.
Java API for XML-based RPC (JAX-RPC)

This ruleset focuses on Java Remote Procedure Call (RPC) aspects relevant while migrating to a cloud environment.
The Java API for XML-based RPC (JAX-RPC, JSR 101) is an API for building and consuming Web services and clients that use remote procedure calls (RPC) and XML. JAX-RPC has several limitations (no support for web service annotations, injection, or handlers for its endpoints). JAX-WS superseded it in Java EE 5. The use of JAX-RPC denotes a tight coupling that is better avoided in a cloud environment.
Possible alternatives are to switch to...

another HTTP-based protocol or API (JAX-WS, REST)
message-based communication (JMS) for asynchronous use cases

In combination with load balancing, both options ensure scalability and high availability.

Is JAX-RPC supported in EAP 6?
Should I use JAX-RPC in EAP 6?

Java Mail API

This is a ruleset for detecting Mail API usage when migrating an application to a cloud environment.
In a cloud environment, mail systems should be considered as backing services.

Ensure that the configuration of the underlying outbound mail connection is not environment-specific (e.g. no static IP, URL, property, credential, certificate...). In OpenShift, environment variables or a ConfigMap can be used for this purpose.
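
A minimal sketch of this approach: the SMTP settings are read from the environment and turned into the standard JavaMail properties `mail.smtp.host` and `mail.smtp.port`. The variable names `SMTP_HOST` and `SMTP_PORT` and the class name are assumptions chosen for this example. To keep the sketch self-contained, it stops at building the `Properties` object that a mail session would be created from.

```java
import java.util.Map;
import java.util.Properties;

public class MailConfigFromEnv {

    /** Builds JavaMail SMTP properties from environment-style key/value pairs. */
    static Properties smtpProperties(Map<String, String> env) {
        Properties props = new Properties();
        props.setProperty("mail.smtp.host", require(env, "SMTP_HOST"));
        props.setProperty("mail.smtp.port", env.getOrDefault("SMTP_PORT", "25"));
        return props;
    }

    /** Fails fast when a mandatory variable is missing, instead of using a hard-coded host. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required environment variable " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        try {
            System.out.println(smtpProperties(System.getenv()));
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a `Map` keeps the method easy to test, while `main` shows the intended use with `System.getenv()`.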

Twelve-factor app - Backing services
Twelve-factor app - Config
OpenShift - Managing Environment Variables
OpenShift - ConfigMaps

HTTP Session data storage

This is a simple ruleset for detecting usage of data storage in HTTP session objects when migrating an application to a cloud environment.
The servlet container uses HttpSession to create a session between an HTTP client and an HTTP server. The session persists for a specified time period, across more than one connection or page request from the user. The HttpSession.setAttribute method allows user information to persist across multiple user connections. Warning: As of Java Servlet Version 2.2, the javax.servlet.http.HttpSession.putValue(java.lang.String name, java.lang.Object value) method is deprecated and replaced by javax.servlet.http.HttpSession.setAttribute(java.lang.String, java.lang.Object).
In a cloud environment it has to be considered that the data in the memory of a running container can be wiped out by a restart (triggered by code deploy, config change, or the execution environment relocating the process to a different physical location) [1]. Consider storing HttpSession data in a cache backing service [2][3][4][5].
This approach has some benefits:

Increased application scalability and elasticity. By offloading the session data to a remote data grid, the application tier itself can be more scalable and elastic.
Session Persistence. By offloading session data to a remote data grid, the application itself will be able to survive EAP node failures, since a JVM failure will not cause the session data to be lost.
Session Data Sharing. If you have a requirement for multiple applications to be able to share session data, this solution might be able to solve that use case as well.

References:

1. Twelve-factor app - Processes
2. JBoss EAP - Externalize HTTP Sessions to JBoss Data Grid
3. Developer blog post - Externalize HTTP Session Data to the JBoss Data Grid
4. Twelve-factor app - Backing services
5. Red Hat JBoss Data Grid for OpenShift

Socket Communication

This is a ruleset for detecting usage of socket communication (both client and server socket usage) when migrating an application to a cloud environment.
Problems:

Socket communication - Java sockets are internal end-points of two-way communications. They are defined by an IP address, port, and protocol (TCP/UDP).
Java NIO channel - Java NIO Channels are designed to provide for bulk data transfers to and from NIO buffers. They can be synchronously and asynchronously read and written.

Direct communication through sockets and channels is an anti-pattern in a cloud environment because it is not a reliable or scalable way of interacting with other systems.

Java EE standard and loosely coupled protocols like JMS, JAX-RS and JAX-WS are recommended for interactions with backing services.
Twelve-factor app - Backing services