aappddeevv/workflow-mess-for-web-apps.md

## workflow-mess-for-web-apps.md

      
Web App Development Issues

Web app development and workflow is pretty much a messy exercise that has few standards. I am trying to document the messiness and suggest one way around it that smoothly scales from a small project to a large project without using yet another workflow layer (YAWL).
I am assuming that you are using tools such as node, grunt and bower. While it is possible to use other tools such as makefiles or generators such as lineman or brunch, the issues that bring such complexity to the workflow process are built into the very nature of web protocols and the application deployment model. Unlike java, which has standardized packaging and deployment models, web apps can take on a wide variety of packaging and deployment models based on several factors such as organizational process ("this is the way we do it"), bandwidth/latency constraints, dev debugging needs, and the application model (client heavy, SPA or server heavy).
There is a lively ecosystem of tools to help manage this complexity, as mentioned above, but each has its own learning curve despite the attractive "run and go", "provision a skeleton" feel. So the question is whether we can use the simpler (hah!) tools such as node, grunt and bower to be incrementally more robust without needing to employ additional workflow tools on top of them. Just like in the java ecosystem, once the number of layers grows, it becomes harder and harder to get line of sight into the more complex configuration as your project grows. Also, the web dev workflow tools are still fairly immature. For example, a copy task using a grunt plugin typically does not check whether a copy is needed because the target is already up to date; it just performs the copy. Until this semantic richness is present, it will be difficult to have efficient layers. Adding this type of functionality requires significant effort, as it implies that additional state is created and maintained from one workflow run to another.
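To make the "is a copy even needed?" point concrete, here is a minimal sketch of an up-to-date check in plain node. It assumes file modification times are good enough as the "state" carried between runs, which is a simplification; the function names `needsCopy` and `copyIfNewer` are made up for illustration and do not come from any plugin.

```javascript
const fs = require("fs");
const path = require("path");

// Returns true when dest is missing or older than src.
// Assumption: mtime comparison is an adequate staleness test.
function needsCopy(src, dest) {
  if (!fs.existsSync(dest)) return true;
  return fs.statSync(src).mtimeMs > fs.statSync(dest).mtimeMs;
}

// Copies src to dest only when needed; returns whether a copy happened.
function copyIfNewer(src, dest) {
  if (!needsCopy(src, dest)) return false;
  fs.mkdirSync(path.dirname(dest), { recursive: true });
  fs.copyFileSync(src, dest);
  return true;
}
```

Calling `copyIfNewer("app/js/app.js", "dist/js/app.js")` twice in a row should copy on the first call and skip on the second, as long as the source has not been touched in between.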
How to Think About the Problem

The issue around packaging and deployment can be seen by considering different, common scenarios for packaging and deploying a web application:

All third party js code should be concatenated to the app's js code then uglified for dev.
A subset of third party js code should be concatenated to the app's js code then uglified for dev.
A subset of third party js code should not be concatenated to the app's js code. Neither should it be uglified.
Some third party js code can never be concatenated or uglified with any other third party or app js code.
For some dev activities, the source map needs to be available alongside the js code.
For production deployment, most js code, both third party and app, should be concatenated and uglified. However, there are a few third party js files that should not be concatenated and uglified.
In place of third party js code, use the js files from a CDN for dev and prod.
You need to be able to switch the third party and app js files that should be concatenated and uglified together easily.
The directories where third party js files live are not uniform and the naming conventions to indicate development and minimized versions are not consistent.

The same description applies to the stylesheet (css) files as well.
These scenarios are highly non-uniform. Semantically, the scenarios above imply several different types of "resource lists", many more than are found in java programs.
Breaking Down the Complexity

There are a few ways to break down the above scenarios into different types of "resource lists."
In the java world, resource lists have different needs. These lists are usually in highly structured and fairly uniform directory/package trees making "folder location" the basis of rules for the required processing:

Resources that require no processing (static) prior to packaging.
Resources that require a transformation prior to packaging.

Examples: Compilation by a java compiler, source to source transformations e.g. templates.


Resources that require multiple transformations prior to packaging.

Examples: java compilation and byte-code augmentation


Because the deployment model is highly uniform, switching in a new version of a third party library is fairly easy: merely specify the jar file with the latest version and ensure that the API did not change. This process is simple compared to the web app world. Recent tools such as gradle and sbt make this a snap and include sophisticated "change" algorithms so you can execute the minimum amount of processing to keep the target up to date.
In the web app world, it is not uncommon to manually copy the version of the third party library you want directly into a folder and use that directly. But once you realize you need different library files for different scenarios, you wind up keeping the third party folder around and manually adjusting many, many paths in your html files.
It's messy. Unfortunately, there is no way around the messiness without using YAWL. It is doubtful that a highly structured packaging and deployment model, similar to java's, will ever be developed for web apps. Textual source code and highly flexible packaging and deployment models (read this to mean highly variable) may always be the norm. Recently, html5 introduced new "import" and other features to help manage this issue, and it's all good functionality to have, but it is not a cure-all.
But let's itemize the resource lists and their needed transformations and the flexible configurations that are needed for a standard webapp. The resource lists are independent of any particular framework like angularjs.
A: Vendor js, css and html files that should be used directly.
B: Vendor js and css files that are allowed, if desired, to be concatenated and uglified together.
C: Vendor js and css files that require their associated source maps.
D: App js and css files that, during dev, need to be kept separate from any concatenation and minification process.
E: App js and css files that can be concatenated and minified during development because (D) files are the focus.
F: App or vendor files that are templates and must be processed by a specific template processor.
Then on top of resource lists that need to be manipulated there is also the need to:
G: Flatten tree structures i.e. take 10 js files at different locations in a tree and create a single file in another tree location.
H: Preserve tree structure from source to target.
I: Remove trees entirely.
J: Remove specific files and source them from a CDN.
K: Move files around in the tree structure e.g. take a main.html in the views folder and place it at the toplevel index.html location.
L: Rename files (see J).
M: Ensure no name clashes.
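A rough sketch of how items G, H and M could be expressed as plain data before any files are touched. The rule shape (`{ dest, src }`) and the helper names are my own assumptions for illustration, not something grunt or bower defines.

```javascript
const path = require("path");

// G: many source files collapse into one target file.
function concatInto(files, target) {
  return [{ dest: target, src: files.slice() }];
}

// H: each file keeps its path relative to srcRoot under destRoot.
function preserveTree(files, srcRoot, destRoot) {
  return files.map((file) => ({
    dest: path.posix.join(destRoot, path.posix.relative(srcRoot, file)),
    src: [file],
  }));
}

// M: report every target that more than one rule wants to write.
function findClashes(rules) {
  const counts = new Map();
  for (const rule of rules) {
    counts.set(rule.dest, (counts.get(rule.dest) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([dest]) => dest);
}
```

Because the rules are just data, a clash check like this can run up front, before a copy or concat step silently overwrites a file.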
Then you also have SDLC issues:
N: Update third party vendor libraries to the next version.
O: Fix bugs in your application's libraries.
P: Deploy to multiple environments including potentially pure desktop mode.
Q: Create dev and prod environments.
R: Scale teams to work on different parts of the application and combine the parts together easily for a total build or CI process.
S: Incremental dev i.e. modify a file and only, as much as possible, cause a rebuild to update the target tree.
Because this complexity is hard to manage, you see many build/workflow scripts that have hard-coded third-party js library files added to a project, hard-coded tree structures, extremely brittle dependency management and fairly bad practices that do not scale to larger projects. All of these issues exist regardless of the use of module systems such as requirejs or other frameworks that operate one level up from the physical source level. And despite the use of more advanced techniques, such as templating to control the variations, it's still a hard problem to solve.
The common set of tools today, such as bower, node/npm and grunt, do not let you easily manage this type of complexity. bower actually describes itself as an unopinionated package system. Unfortunately, because it is unopinionated, it lacks semantic richness. For example, it's hard to identify the assets to use in the various scenarios above without relying on naming and directory conventions that may not be common across other packages. Of course, bower has some capabilities to override asset specifications, but they do not meet the need for flexibility across scenarios. To manage these complexities, more tooling has been created that layers on yet more standards and layers of workflow processing. It is actually quite humorous that as soon as one tool comes out, another tool is created to manage the issues that the first tool has in solving the end to end problem. Of course, using many "layers" is more in line with unix style thinking of piecing together many small programs to make a large one, but too many layers from different authors causes other types of complexity. Because all of these issues reflect the wide diversity in application development needs, popular toolsets like bower and grunt solve some issues but often lead to brittle build systems that are little more than recorded "macros" of the manual activity.
However, unless the packaging and deployment model becomes more structured or best practices become widespread, it's the way it works today and tomorrow. In other words: deal with it.
Taming Some of the Complexity

Taming complexity could be as easy as creating highly structured directory structures where the assets are in directories that have semantic meaning e.g. aDirForStaticJsFiles. To change the processing sequence for a particular js file, for example, you would move the file from one directory to another. This is probably less than ideal. Another option is to create "resource lists" that have semantic meaning and to be able to move resources or sets of resources between these lists. These resource lists can then be processed differently. This idea is similar to the idea of wildcarding (./js/**/*.js) a directory for a grunt copy task to copy files from one tree to another. These locations are often hardcoded into the configuration section of a grunt file rather than specified as variables in the grunt script, and they are often just strings rather than path objects. Once hardcoded into the grunt configuration, they become ineligible for reuse in other tasks.
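One way to picture the "resource lists" idea is as named arrays held in ordinary variables, with a small helper to move a file from one list to another. This is only a sketch: the list names, the `moveResource` helper and the `app/js/util.js` path are hypothetical.

```javascript
// Named resource lists; entries could just as well be glob patterns.
// List names and app/js/util.js are hypothetical, for illustration only.
const resourceLists = {
  vendorBundle: [
    "bower_components/jquery/dist/jquery.min.js",
    "bower_components/bootstrap/dist/js/bootstrap.js",
  ],
  appSeparate: ["app/js/app.js"],
  appBundle: ["app/js/util.js"],
};

// Returns a new set of lists with `file` moved from one list to another.
// The input is left untouched so other tasks can keep using it.
function moveResource(lists, file, from, to) {
  if (!lists[from].includes(file)) {
    throw new Error(file + " is not in list " + from);
  }
  return {
    ...lists,
    [from]: lists[from].filter((f) => f !== file),
    [to]: lists[to].concat(file),
  };
}
```

Changing how `app/js/app.js` is processed then becomes `moveResource(resourceLists, "app/js/app.js", "appSeparate", "appBundle")` rather than moving the file on disk or editing paths in several task configurations.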
If your application is fairly simple and small, you probably do not need to tame complexity, and if redeployment over time is not critical e.g. updating vendor libraries, then again, you do not have much complexity to tame. You can do almost anything that is hardcoded and it's probably good enough to get the job done.
Let's consider an application that uses jquery, bootstrap and the newer polymer libraries (so we can get that nice html5 import statement). Let's assume that we will take all of bootstrap in one gulp and not try to piecemeal it in order to shrink js and less/css size. It will be clear in the formulation below that we can finely slice the bootstrap libraries if we want to. We'll also assume that it's a corporate application and the use of CDNs is unnecessary.
We have the following resource lists:

jquery and bootstrap js files, each in different locations. We want source maps for dev but we need to concatenate and minify for prod:

bower_components/jquery/dist/jquery.min.js + jquery.min.map (we can still concat and minify even with the min files so we do not need the plain jquery.js file).
bower_components/bootstrap/dist/js/bootstrap.js (no map)
bower_components/platform/platform.js


Bootstrap fonts that should be available without transformation. These have to be placed in a location where the css can find them. In bootstrap, that location is ../fonts relative to the css file.

bower_components/bootstrap/dist/fonts/*


app specific js files that should not be concatenated or minified so we can debug them more easily (we will forget for a second that we could create a source map for all js files that have been concatenated and minified).

app/js/app.js


polymer vendor files that have to be included in the component definitions that live in separate html files. These mix html and javascript together as well as depend on specific js files. These should be included as is for simplicity and because they are used directly as-is as part of the import process in an embedded component.

bower_components/polymer/dist/polymer.html
bower_components/polymer/dist/polymer.js
bower_components/polymer/dist/polymer-body.html


app specific less files. These need to be transformed to css files and may include sub-css files, including bootstrap's production css file. A possible variation, not considered here, is that we also need direct access to the bootstrap less files. We want to keep these separate for dev but concatenate and minify for prod.

app/styles/main.less


vendor css files that we want to include as is for dev but concatenate and minify for prod

bower_components/bootstrap/dist/css/bootstrap.css
bower_components/bootstrap/dist/css/bootstrap.css.map


app specific image files to be copied as is

app/images/*.png *.jpg etc.


We have not called out the need for CDN files at this point as we are assuming a corporate application and do not want to rely on CDNs for content for jquery or bootstrap. We have also not tried to concatenate everything, for example, even the HTML files can be concatenated and made more efficient.
We'll assume that our output tree for dev or prod is the same, although for prod, we would expect fewer files since many of them will be concatenated and minified.

dist - main distribution tree for dev or prod depending on what you are building

js - js files. Dev has many files and prod would have fewer
images
fonts - font files from bootstrap
views - html files of various sorts
views/polymer - the component imports needed for polymer components
views/components - html files for app specific polymer components
index.html - the main entry point for the web page
styles - css files. Dev has many files and prod would have fewer


We also see that even with this type of breakdown into resource lists, brittleness creeps in. If we have an index.html file and it needs to include the js files for the app, under the dev tree we need to include multiple js files, but for prod we may need just one e.g. app.min.js. Many workflows rely on some type of template processing to slice in the "list" specific to the dev or prod deployment. However, we can use html5 imports to control this. That means our core index.html file can stay the same and use a generic link rel="import" statement. We would generate the "file" that index.html imports with the proper list, or use some other clever approach. The index.html file stays the same and the build/workflow system determines what block to generate into the tree.
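As a sketch of the "generate the file that index.html imports" idea: index.html would carry one fixed line such as `<link rel="import" href="app-imports.html">`, and the build would write a different app-imports.html for dev and prod. The file name app-imports.html, the `renderImportFile` helper and the flattened js/ paths are assumptions made for illustration.

```javascript
// Renders the body of the generated import file: one script tag per js file.
// Paths are written relative to the import file, assumed to sit at the dist root.
function renderImportFile(jsFiles) {
  return jsFiles.map((file) => `<script src="${file}"></script>`).join("\n") + "\n";
}

// Dev: many separate files. Prod: the single concatenated, minified file.
const devImports = renderImportFile([
  "js/jquery.min.js",
  "js/bootstrap.js",
  "js/platform.js",
  "js/app.js",
]);
const prodImports = renderImportFile(["js/app.min.js"]);
```

The build step then writes either `devImports` or `prodImports` to dist/app-imports.html, and index.html is copied through unchanged.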
Is this import approach better than the template "slice"? It could be, especially for larger projects where the logic to generate the list could be complex, which means running a template system to slice in the links for dev could be time-consuming in a highly iterative dev cycle. Using the import approach means that only the changed file, in this case index.html, would have to be copied to the target tree, rather than regenerating the import list, which requires more processing. This trade-off may or may not be significant for your app and dev cycle. Note that using an import follows a different user-agent load processing path than slicing content directly into the html file.
We also see that the polymer files used in the templates require a set location, which means the location of the polymer files would be hard coded into the components. This is not unlike the java package system, and in general a project would have to set standards around the location of these types of vendor files. However, a link import could also be used by projects to stabilize the code without resorting to templating/slicing. For example, an importable html file could be published to dist/views/polymer-imports.html and all components would know to import this file to get polymer element definition support.