@kergoth
Last active May 18, 2022 16:17
OE/Bitbake thoughts and tasks

Long standing concerns/issues

These are things we've (or I've) wanted to improve, or actively disliked, for years, but dealing with them is never a priority.

  • Steep learning curve. Is this a natural consequence of our level of flexibility? Is there a way to make it clearer how the final metadata comes from multiple sources?

Maintainability

  • Tight Coupling amongst bitbake modules
  • Inconsistent, and at times unclear, API design
    • Parser. Related to the false flexibility of bb.parse in general. It was written as though changing file formats is a possibility, but switching formats without also changing bb.data and the rest would not be viable; they're too interconnected.
    • Fetcher.
    • Metadata. Camelcase api on DataSmart, etc.

Usability

  • Inconsistent layer priority between recipes and config/classes. This is something I've disliked since the original layering implementation, but it doesn't seem to be a priority for others. Perhaps submission of the sort-layers sub-command for bitbake-layers may be the best step forward here.
  • No consistent interpretation of string values as other types. Official type conversion would aid this consistency
  • The file format is a mix of declarative and imperative
  • Does many things, not one thing well ala Unix philosophy
  • Recipes end up with too much knowledge of how bitbake does things, such as adding tasks.
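
One way to picture the "official type conversion" idea from the list above is a single shared helper that interprets a variable's string value according to a declared type. This is only a sketch: `convert_value`, the `CONVERTERS` registry, and the type names are hypothetical and are not an existing bitbake API.

```python
# Hypothetical sketch: one shared place to interpret metadata strings as typed values.

_TRUE = {"1", "y", "yes", "true", "on"}
_FALSE = {"0", "n", "no", "false", "off", ""}


def to_boolean(value: str) -> bool:
    """Interpret common spellings of true/false; reject anything else."""
    lowered = value.strip().lower()
    if lowered in _TRUE:
        return True
    if lowered in _FALSE:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def to_list(value: str) -> list[str]:
    """Split a whitespace-separated value, as most list-like variables are written."""
    return value.split()


# Assumed registry of type names; a real design would let layers extend it.
CONVERTERS = {
    "boolean": to_boolean,
    "integer": lambda v: int(v.strip(), 0),  # base 0 accepts 0x.. and 0o.. prefixes
    "list": to_list,
    "string": str,
}


def convert_value(value: str, type_name: str = "string"):
    """Convert a raw string using the converter registered for type_name."""
    try:
        converter = CONVERTERS[type_name]
    except KeyError:
        raise ValueError(f"unknown type: {type_name!r}") from None
    return converter(value)
```

With one registry like this, every consumer of a given variable would agree on whether "yes", "1", and "true" mean the same thing.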

Tooling

  • No OE-provided tool for working with multiple repositories and layers. Everyone has to choose their own, which leads to inconsistency in setup and tooling
    • Candidates for investigation
      • oe-bakery
      • kas
      • whisk

Layer Setup and Configuration

  • Repository management tools can be polarizing. Dare we adopt one, or roll our own based on the bitbake fetcher? I'm inclined to the latter, even though we'd be reinventing the wheel somewhat. One issue is the bitbake fetcher expects to not operate against an existing working copy, and we'd need the ability to align the current checkouts with the configuration the way other repository management tools do.
    • If we were to adopt an existing tool, there'd be the question of whether we solely call out to it and rely on the user having expertise with it, or hide those details.
    • I don't think we should require the user be an expert in repo or the like. Submodules would be the lowest hanging fruit were we to adopt an existing solution, but they have downsides. Kas doesn't really do any repo management, only calls out to external commands, so you'd still have to use submodule operations.
    • Both whisk and kas rely on external repository management tooling, with examples using git-submodule.

My biggest concern with what I'm thinking as a theoretical tool is use of bitbake configuration rather than YAML, as you lose the hierarchy for the configuration and we're stuck with bitbake file format limitations. I don't think this is a big issue for layer setup, but could be more of an issue for configuration, unless we make it easy to hook into that via python plugins instead. The use of yaml configuration files for kas, for example, means it's easier to kick off different product configurations than it is for a pure bitbake build setup, though we do have some options there. We can certainly still have various configuration fragments pulled in on an as-needed basis for the different build configurations, and could even leverage multiconfig for the various products to be built. Would that be enough? And how best to handle conditional local.conf adjustments?

I think .conf fragment inclusion would work fine for variants rather than use of yaml.

Consider kas and whisk capabilities and tradeoffs. I rather like whisk's heavy reliance on multiconfig rather than placing settings in local.conf, and they provide bbclasses to ease that. Their use of BBMASK to deal with layer versions, avoiding the need to alter bblayers, is interesting, but limits support to certain yocto release versions. Still, use of multiconfig for project/product is quite nice. I'm not sure we need to generate those files from a yaml file rather than just creating them directly. Generation makes it slightly less error-prone, I guess, but it also hides bitbake implementation details. While that reduces confusion, it can also hinder the user's understanding of the project. I'd like to ease setting this up, likely with commands to quickly edit/create/copy/etc configurations, or potentially with optional support for generating the config from yaml.

This tool should create a project workspace which includes:

  • Cloned repositories, fetched by the bitbake fetcher
  • Product configurations which are used as multiconfig builds
  • Local config which only references products to build but isn't used otherwise, much like whisk
  • Ability to build without sourcing anything

This project should be designed to be source controlled, and should discourage the user from placing configuration in local.conf.

One question is how to handle conditional layer inclusion for multi-BSP builds with bad vendor BSPs. If multiconfigs are used, then bblayers.conf is the same across all of those configurations. We have the option of explicit layer inclusion/exclusion rather than including all layers which are cloned, with optional conditional inclusion, though this would require multiple bitbake commands under the hood. We could also go the BBMASK route, which whisk supports, where all but the current BSP layer are masked out. Alternatively, since this is intended to be git controlled, the user could branch whenever a problematic vendor layer is involved. Or we can combine approaches: use the BBMASK method like whisk does, while letting the user branch for older releases that don't support BBMASK with multiconfigs.
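
As a rough illustration of the BBMASK route, the sketch below computes a mask that hides every BSP layer except the one belonging to the product being built. The function name, the BSP names, and the paths are all made up for the example, and it assumes BBMASK is written as a space-separated list of regular expressions matched against file paths.

```python
import re


def bbmask_for_product(bsp_layers: dict[str, str], active_bsp: str) -> str:
    """Return a BBMASK value hiding every BSP layer except active_bsp.

    bsp_layers maps a hypothetical BSP name to its layer directory.
    """
    if active_bsp not in bsp_layers:
        raise KeyError(f"unknown BSP: {active_bsp}")
    patterns = [
        # Entries are regular expressions, so escape the path and add a
        # trailing slash to avoid matching sibling directories.
        re.escape(path.rstrip("/")) + "/"
        for name, path in sorted(bsp_layers.items())
        if name != active_bsp
    ]
    return " ".join(patterns)
```

A per-product multiconfig file could then carry the generated value, so bblayers.conf itself stays identical across products.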

While whisk is interesting, I think a lot of what it's doing can essentially be done via config file inclusion. For example, versions to control layer sets could be easily just included into a product config if you don't want to duplicate it, and the provided structure for this could be supplied in the template.

Is there a difference between cloning an existing project workspace to do builds and initializing a new project workspace to create new products, etc.? Should this be a single command or multiple, or should it depend on the source repository as to whether it's a project or a template? Does there need to be differentiation? We'd probably want to support initializing a fresh setup from scratch, but where would that come from, and where would this plugin be installed? I think we could provide a default project workspace template on github or git.yoctoproject.org.

Part of my intention here is to retain high flexibility, but to provide sane default structure to it via templating. Kas and Whisk encode a certain amount of fixed structure inherent in the format of the yaml, though kas does provide more flexibility in the ability to pull additional yaml configuration from layers.

Actually, the way I'm thinking of this tool is almost like repo, except for the heavy reliance on plugins. Isn't there a project that does that already? It'd be less oe specific, though. Is that good or bad?

Example usage of a theoretical tool

The intention here is to have a host-installed lightweight tool which embeds a portion of bitbake but which is versioned independently from it, and once you have a project, it calls out to the underlying tooling for everything, no longer needing its own parsing mechanisms except for its own configuration file. This is the same as what oe-lite did with its oe-bakery tool, and I think it's a great option. We're limited more by using bitbake configuration files rather than yaml, but it avoids hiding the bitbake file format details from the user, which is a win, and is less divisive, I think.

Some commands are obvious:

  • repo-fetch: fetch repositories/layers of an existing project, realigning them with the config
  • clone: clone a project repository, then call repo-fetch
  • repo-foreach: run a command within each fetched repository
  • init: emit the environment necessary to run bitbake and related tool commands, to be sourced/evaluated by the user's shell
  • exec, run, or shell: execute the specified command inside the environment without needing to source/evaluate it in your shell. This could just spawn a new shell in that context by default if no command is specified. See limactl shell in the lima project.
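
To make the init idea concrete, here is a minimal sketch of how such a command might render shell text for the user to evaluate. `render_env` and its arguments are hypothetical; the real set of variables would come from the project configuration.

```python
import shlex


def render_env(variables: dict[str, str], path_prepend: list[str]) -> str:
    """Render POSIX shell text that a user could eval to enter the project."""
    lines = []
    if path_prepend:
        joined = ":".join(path_prepend)
        # Keep the user's existing PATH after the project's tool directories.
        lines.append(f'export PATH={shlex.quote(joined)}:"$PATH"')
    for name, value in sorted(variables.items()):
        # shlex.quote keeps values with spaces or metacharacters intact.
        lines.append(f"export {name}={shlex.quote(value)}")
    return "\n".join(lines) + "\n"
```

An exec or shell subcommand could reuse the same data but apply it to the environment of a child process instead of printing it, which avoids touching the user's shell at all.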

All other commands would be defined by the project configuration, or the user's personal tool config in their home directory, or the loaded plugins.

We should support both lightweight aliases for the subcommands and full fledged commands backed by python plugins, to be loaded by the configuration plugin search path, which should optionally include loading plugins from the fetched repositories.

The lightweight aliases will make it easy to add commands for subcommands of multiple underlying commands, i.e. bitbake, bitbake-layers, devtool, recipetool, etc.
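
A lightweight alias could be as simple as a table lookup on the first word of the command line. The sketch below assumes a hypothetical alias table; in the tool it would be loaded from the project or user configuration rather than hardcoded.

```python
# Hypothetical alias table mapping a subcommand to the underlying command line.
ALIASES = {
    "build": ["bitbake"],
    "modify": ["devtool", "modify"],
    "add-layers": ["bitbake-layers", "add-layer"],
}


def expand_alias(argv: list[str], aliases: dict[str, list[str]]) -> list[str]:
    """Expand the first word of argv through the alias table.

    Unknown subcommands fall through unchanged, so `bitbake -e`
    simply becomes `bitbake -e` run inside the project environment.
    """
    if not argv:
        raise ValueError("no subcommand given")
    head, rest = argv[0], argv[1:]
    return list(aliases.get(head, [head])) + rest
```

Full-fledged python plugins would only be needed where a command has to do more than prefix its arguments.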

We could ship a project configuration for this tool to allow for its usage in the ESDK.

oe clone git://foo.com/my-project.git
cd my-project

# light wrapper around setting up the env and calling `bitbake target`
oe build core-image-base

# set up the needed environment for a shell in this project, including PATH, etc, and ideally also completions
eval "$(oe init zsh)"
bitbake -e

# Example of usage of defined aliases
oe bitbake -e
oe devtool modify gcc
oe modify gcc

# We could have an option to persist this change by altering not just the bblayers but also the project configuration,
# and it could accept remote repos/layers.
oe add-layers ./my-layer
oe fetch-layerindex meta-gplv2

Commonalities

  • The tooling either wraps the underlying layer fetch/update tooling or doesn't do anything at all, assuming the layers are present.
  • The expectation is for a project/product repository which aims for configuration management for your product development, also moving away from user configuration in local.conf.
    • YOE uses site.conf.
    • Whisk uses a YAML file which generates multiconfigs for products.
    • Kas uses a YAML file which generates local.conf and bblayers.conf.
    • Autobuilder uses JSON configuration with inheritance to set up all underlying configuration.
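
For a sense of what configuration inheritance in JSON can look like, here is a small sketch. The `inherit` key and the example entries are invented for illustration and do not reflect the autobuilder's actual schema.

```python
import json


def resolve(name: str, configs: dict[str, dict], _seen: tuple = ()) -> dict:
    """Merge a named config over its parent chain; child values win."""
    if name in _seen:
        raise ValueError(f"inheritance cycle at {name!r}")
    entry = configs[name]
    parent = entry.get("inherit")
    merged = resolve(parent, configs, _seen + (name,)) if parent else {}
    merged.update({k: v for k, v in entry.items() if k != "inherit"})
    return merged


# Made-up product definitions: product-a only states what differs from base.
EXAMPLE = json.loads("""
{
  "base":      {"DISTRO": "poky", "MACHINE": "qemux86-64"},
  "product-a": {"inherit": "base", "MACHINE": "beaglebone"}
}
""")
```

The same effect is available in bitbake configuration through include/require of shared .conf fragments, which is the route favored earlier in these notes.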

YOE selects the MACHINE through which envsetup.sh you source, but that's all. Any further configuration goes in site.conf, so you'd need multiple branches or repositories to set up multiple builds for your needed products and CI jobs.

Requirements

  • Support cloning the repository that holds the tool configuration. Kas does not do this today.
  • Support plugins.
    • Support loading plugins from cloned repositories, either automatic or manual by configuring the plugin search path. Kas does not allow this today.
  • Allow use of a host-installed command or one provided with the project repository.
  • Avoid heavy boilerplate. Kas for example has to do ./kas-container shell ./project.yml -c 'bitbake -e recipe' just to run bitbake -e recipe, which is not friendly.

Random thoughts

  • I think we should minimize what must be done in the shell environment. This was half the point of having a mechanism like bblayers.conf, so bitbake would find its own context and set the vars we used to set in the environment.
    • Are any of the default vars in BB_ENV_PASSTHROUGH_ADDITIONS appropriate to be moved to bitbake?
    • For any of the oe-core default vars in BB_ENV_PASSTHROUGH_ADDITIONS which can't be moved to bitbake, can we alter it from the layer.conf, rather than the shell environment? No, this does not work: it's expected to be defined in the actual environment, presumably because the environment is set prior to layer.conf parsing, and we can't alter that inheritance after the fact right now. I feel like this could be done, though.
      • Related: the oe-core default value includes the items in bb.fetch2.FETCH_EXPORT_VARS. I feel like these should either also be in the default whitelist or should not be in the default BB_ENV_PASSTHROUGH_ADDITIONS. Either these should be allowed everywhere, or just for fetch, but it should be consistent and done in the same place.
      • We really should be able to adjust the environment filtering from a layer, since the layer knows what elements it needs to be able to access from the host, what the HOSTTOOLS are, etc.

TODO

  • Look into how the autobuilder does things with its JSON configuration inheritance

Reference

Thoughts from the community

<sveinse> Since yocto/bitbake doesn't really approach any configuration management (except local.conf, site.conf, et.al), there will have to be made a framework on top of yocto. -- Depending on the circumstance of what and how the yocto artifacts will be used and deployed.
<sveinse> We have a top-repo which is a git submodule repo containing all the (pinned) layers and a build/conf dir with the specifics for that build. Each branch of this repo corresponds to a separate build lane. And on top of this is a custom build management system that manages build versions and are packaging the yocto images into customer deployable formats.
<vvn> kergoth: multiconfig and site.conf do it all already. I think it would be great that instead of making people write or copy a build/conf/local.conf, Yocto encourages people to define a project layer with all the build-specific crap: multiconfig + site.conf + various keys or certs files, etc. <git root>/meta would be an ideal path for a versioned project.

This highlights the usefulness of a "product" concept which is the integration of and tweaking of a particular machine/distro combination:

<vvn> kergoth: to give an example, imagine you have a beaglebone based product without sound, and another product with sound, both using your own distro. The alsa packages on the first product aren't needed, so would you remove the 'alsa' feature from your machine conf (because it is added to the TI soc configuration) or do you tweak the machine/distro features elsewhere?
<vvn> kergoth: I guess that's the grey area for machine configurations, where for a SoC you really expect the features to describes the capabilities, while for the configuration of the final product, you may describe the intended usage
3:18 PM <RP> The inheritance in the json works nicely
3:18 PM <kergoth> I was explaining to my wife yesterday how hard it is to do much of anything given how flexible we need to be and the vast number of use cases, none of which are listed anywhere, that we need to support :) Anything you add has to be able to work for eeeevvverything. or at least avoid limitations as much as possible
3:19 PM <RP> kergoth: this is why I keep putting off the layerconfig/setup stuff :/
3:20 PM <RP> kergoth: I think the best approach may be just to "standardise" on the several current standards and allow bitbake-setup to drive them somehow in a more generic way as they ultimately all do roughly the same things, just in different ways
3:31 PM <kergoth> I can see that. Perhaps we need a very lightweight tool that's easy to extend that can call out to underlying mechanisms with its subcommands, so you can easily customize the commands for your workflows, I think this is important, and it should be as easy as adding a prefixed shell script to the PATH the way you do with git.. and certain common elements to all of the current tools, such as creation of a product/project source 
3:31 PM <kergoth> control repository and moving toward configuration management for your product/project configuration, actively avoiding shoving build configuration into local.conf and an uncontrolled build directory.
3:33 PM <RP> kergoth: right, something like that
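
The "prefixed shell script on the PATH" idea from the chat above can be sketched in a few lines. The `oe-` prefix and both function names are assumptions made for this example.

```python
import os
import shutil
from typing import Optional


def find_external_subcommand(name: str, prefix: str = "oe-",
                             path: Optional[str] = None) -> Optional[str]:
    """Return the full path of `<prefix><name>` if it is an executable on PATH."""
    return shutil.which(prefix + name, path=path)


def list_external_subcommands(prefix: str = "oe-",
                              path: Optional[str] = None) -> list[str]:
    """List subcommand names provided by prefixed executables on PATH."""
    search = path if path is not None else os.environ.get("PATH", "")
    found = set()
    for directory in search.split(os.pathsep):
        if not os.path.isdir(directory):
            continue
        for entry in os.listdir(directory):
            full = os.path.join(directory, entry)
            if (entry.startswith(prefix) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                found.add(entry[len(prefix):])
    return sorted(found)
```

With this, dropping an executable named `oe-hello` anywhere on the PATH would make `hello` available as a subcommand without touching the tool itself.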

Independent research

  • Layer Tooling (Yocto)
  • Tools for layer/repo management
  • bitbake: Add support for remote layering. This adds support to bitbake to add layer uris directly in BBLAYERS rather than at a level above it. This is interesting. Could we utilize this for the higher level tool rather than adding a new configuration file? Have the project define the conf/bblayers.conf with urls, thereby aligning the two? The current implementation seems to limit to specifying a single layer under a given repository, so we'd have to get past that.

Tools

  • kas
  • whisk
  • wr-lx-setup. This tool generates a repo manifest and then uses it to fetch, then it configures sample configuration files and calls out to oe-init-build-env.
  • hopper. Untested, but it does seem like this can clone as well as configure.
  • yoe. This DISTRO goes the other direction, heavily relying on sourcing setup scripts that specifically set up configuration in a certain way, to avoid the user having to edit config files before they kick off a build. YOE also provides many shell functions to ease certain operations, which has some of the benefits of a subcommand-style tool, but in a more shell-reliant way. YOE focuses on product development, which is worth emphasizing, as I think that's something we need to support well with an official layer setup and configuration tool. YOE tries to use only existing standard tooling, hence the shell functions, etc, to avoid the user having to learn yet another new tool. There's value in that, I think, though I personally don't love having to inject things into my shell environment in general. YOE's use of site.conf rather than local.conf for system configuration is certainly an improvement, and could be more easily source controlled. YOE_PROFILE is an interesting way to ease common setup, but it sort of feels like a hacky version of USE, and it makes me want to do the same via USER_FEATURES instead. It's an interesting approach, and I certainly like the simplicity and clarity. I also like that it doesn't obscure or obstruct learning how oe/yocto stuff works, which is the issue I personally have with kas and whisk. What I don't love is the fact that the setup bits have to live in the project repository, so you'd have to fork yoe to set up your own project. Describing local.conf as being for specifics relative to the machine you're building on is quite interesting, and a good way to limit its scope. The included site.conf is rather messy, though, and I do wonder if everyone will just shove everything in there rather than making their own distro. I also don't like that layers can't inject additional functions for envsetup.sh.
One upside is you don't have to install anything, but it's not particularly clear how you'd branch off from this for your own product in a sustainable way and get updates from yoe, unless you just let git merge handle everything.

I feel like use of the bitbake fetcher for layers, rather than submodules, isn't transparent.

Notes on kas

Pros
  • Plugins support
  • Subcommand-based
Cons
  • Cannot load plugins from layers
  • No 'clone' command
  • Too much boilerplate for commands
  • Hides things that should be visible, requires use of the yaml configuration

Notes on whisk

Pros
  • Product layer masking via BBMASK
  • "Product" concept
  • Matrix of product, mode, site, and version
  • Less is hidden than with kas, I believe, in that it drops you into an environment where you run normal commands, but as a result it's heavily environment-dependent much like the current setup scripts
  • Pretty simple implementation
  • Use of multiconfigs
    • Cross-multiconfig deploy dir handling
Cons
  • Requires inclusion of init-build-env with arguments
  • Hardcoded matrix of product, mode, site, and version
  • No subcommand-based interface with ability to add plugins for workflow
  • Requires use of the yaml configuration

Only configuration

Only layer setup / clone

  • git-submodule
  • git-subtree
  • repo
  • Combo-layer. This is used by poky.

Considerations

Bitbake

Code execution / traceback handling

RunQueue

I've often thought about simplifying the runqueue, but it's not a priority, and experimentation in that area was before sstate existed.

File Format

  • Add an operator to reset/clear pending operations on a variable explicitly, either in its own statement or via an additional assignment operator. This would be necessary for multiple cases, including the aforementioned "lazy" operator behavior. Richard had proposed this in the past.
  • Investigate making +=/=+/.=/=. operate in a "lazy" fashion by default.
  • Consider incorporating the event in an addhandler line for convenience, rather than explicit eventmask
  • From oe-lite
  • Prefer explicit hooks to logic in recipes. See also https://gist.github.com/kergoth/9d7d31999db83ab045db
  • https://www.openembedded.org/wiki/New_Bitbake_Syntax_Brainstorm
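
To show why lazy operators and an explicit reset operator go together, here is a toy model. It is not bitbake's data store; the `Variable` class and its method names are hypothetical and exist only to illustrate the behavior discussed in the list above.

```python
class Variable:
    """Toy model of a variable whose appends/prepends are applied lazily."""

    def __init__(self):
        self.value = ""
        self.pending = []  # (operation, text) pairs, applied at read time

    def assign(self, text):
        # Plain assignment replaces the base value but keeps pending
        # operations, which is why an explicit reset is needed.
        self.value = text

    def append(self, text):
        self.pending.append(("append", text))

    def prepend(self, text):
        self.pending.append(("prepend", text))

    def reset(self):
        """Hypothetical 'clear pending operations' operator."""
        self.pending.clear()

    def get(self):
        result = self.value
        for op, text in self.pending:
            if op == "append":
                result = f"{result} {text}".strip()
            else:
                result = f"{text} {result}".strip()
        return result
```

In this model an append made before a later assignment still takes effect, so the only way to discard it is to call `reset()`, which is the role the proposed operator would play.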

Parsing

Metadata / OE

Setup Scripts

Consolidated Old Docs

Bugs

  • Open a yocto bug for the cross-recipe eventhandler leakage, if it still exists, see https://gist.github.com/kergoth/11460892
  • Review and submit missing python dep fixes from https://gist.github.com/kergoth/7d2e9f6c7d0fd759d7f2
  • If you have a local git mirror tarball, but no local clone, and MIRRORS and PREMIRRORS are unset, the fetch will fail, as there's no logic to check for the mirror tarball already existing, lacking a mirror fetch method to call. Might be able to adjust the return value of localpath() the way I do for update, but that could have complications
  • codeparser doesn't handle non-call name lookups to the metadata

Enhancements

  • Experiment with use of unshare or https://github.com/containers/bubblewrap for bitbake builds, as we could possibly leverage mount namespaces to avoid relocation issues by always mounting source trees and possibly build output in a consistent place. /sysroot-components/? Update: unshare is now used by bitbake to disable networking in non-networking tasks, but isn't currently mandatory. Using it for a mount namespace is still a possibility, but leveraging it would mean making this functionality mandatory.
  • Knotty
    • Add a status bar
      • Build target (e.g. core-image-base)
      • Distro
      • Machine
      • Build progress bar
    • Add a separator between the running task display and the messages
  • apt-get/yum style "install" of bits from metadata into project area. This could really just wrap devtool modify..
  • all sources in git, our own repositories, no local patches

Metadata / oe-core

Old Incoming

Incoming / Unsorted from old docs

Workflowy

Old/Projects/Long Term OE Replacement Thoughts

  • Key attributes/features

    • Full distribution build from source
    • Self contained, doesn't care what the host distribution is
    • Constructs images from binary packages, builds the projects which provide the required binary packages
    • Collaborative packaging environment
    • SRC_URI / source handling / patching / etc
    • Crosscompilation
  • Source handling can go away in favor of use of SCM

  • Image construction could be done by tools like debootstrap, from binary packages, within a virtual machine (e.g. aboriginal linux) or on a target board

  • All that remains is taking a set of metadata+sources and traversing dependencies to build them all in the correct order, with a private sysroot, not the desktop's
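
The remaining "traverse dependencies and build in the correct order" step is essentially a topological sort. A minimal sketch using the standard library follows; the recipe names in the usage note are only examples, and `build_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def build_order(depends: dict[str, set[str]]) -> list[str]:
    """Return recipes ordered so each one comes after its dependencies.

    depends maps a recipe name to the set of recipes it depends on.
    graphlib raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends).static_order())
```

For example, `build_order({"image": {"busybox"}, "busybox": {"glibc"}, "glibc": set()})` places glibc before busybox and busybox before image.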

todo-yocto.md

Features / Improvements

  • If the license priority bits never went upstream, see about doing so. https://gist.github.com/kergoth/1590028
  • Submit portable implementation of dirs/pushd/popd from https://gist.github.com/kergoth/5044367 as utils in oe-core for use in the metadata. The implementation uses space separator, so will break in paths with spaces, but we break there anyway.
  • Submit prepare_sources to upstream Yocto - https://gist.github.com/kergoth/9d7d31999db83ab045db
  • Add default PACKAGECONFIG value to base.bbclass from DISTRO_FEATURES, with an intermediate variable for the recipe to use if they want to extend the default
  • Is the packageconfig-backdel prototype approach worthwhile, or would it be better to have a standalone tool which emits PACKAGECONFIG_remove_pn-* in a .conf which can be included, or modified further? Is there any other obvious low hanging fruit wrt items to add to CONSIDERED? E.g. xz is probably worthwhile across the board. tcp-wrappers should really obey a distro feature in all cases, but this isn't the case at the moment
  • Leverage bits from populate_sdk_ext for archive-release
    • Nothing from populate_sdk_ext.bbclass will be of use to us, as it seems to be all specific to how they want the ext sdk constructed, but oe.copy_buildsystem and the scripts it calls will be of use.
    • Make use of oe.copy_buildsystem.{generate_locked_sigs,prune_lockedsigs,create_locked_sstate_cache} to populate the sstate cache for archive-release. We won't be shipping the locked sigs configuration, since our users are expected to be doing more with the product, but we can use a temporary file for it just to populate the sstate cache. Update: we no longer ship sstate.
    • Look into the bitbake/layer handling. I checked into oe.copy_buildsystem.BuildSystem.copy_bitbake_and_layers, but it's of limited usefulness to us since we want independent tarballs for each component, rather than a monolithic archive. Still, could be worth using if we then archive up each file/dir in the directory it populates. One potential issue is, it moves non-corebase-relative layers to the root, so layers like meta-networking will likely end up in the root of the sdk rather than retaining their relative path from ${COREBASE}/... We should confirm that behavior, though.

Old TODO

  • random thoughts

    • A rootfs_tar could be doable if package_tar put the postinst into the first-run script location on the target. rootfs_tar would then, for each first-run script in the resulting rootfs, try to run it and remove it if it succeeds. For metadata, we could leverage pkgdata combined with a basic runtime dependency traverser to get all the needed tarballs.
  • packaging

    • packagedata
      • Drop pkgdata as a means of communicating information between tasks, replacing the separate per-package-manager tasks in favor of something PACKAGEFUNCS-like, or PACKAGEFUNCS-based. Relevant functions:
        • package: read_subpkgdata()
        • package_rpm: read_subpkgdata_dict()
        • package_deb, package_rpm, package_tar, package_ipk: read_subpackage_metadata
      • Test implementation of pkgdata with persist_data, or directly using an sqlite3 database.
    • Move more common packaging functionality into oe.package
    • Make package.bbclass use oe.package.files
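
As a starting point for the sqlite3 experiment mentioned above, a test implementation might look roughly like this. The table layout and function names are assumptions for illustration, not an existing oe-core interface.

```python
import sqlite3


def open_pkgdata(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a pkgdata database with one row per package field."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pkgdata ("
        " package TEXT NOT NULL, field TEXT NOT NULL, value TEXT NOT NULL,"
        " PRIMARY KEY (package, field))"
    )
    return conn


def write_pkgdata(conn: sqlite3.Connection, package: str, fields: dict[str, str]) -> None:
    """Store or update the fields of one package in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO pkgdata (package, field, value) VALUES (?, ?, ?)",
            [(package, name, value) for name, value in fields.items()],
        )


def read_pkgdata(conn: sqlite3.Connection, package: str) -> dict[str, str]:
    """Return all stored fields for a package, or an empty dict if unknown."""
    rows = conn.execute(
        "SELECT field, value FROM pkgdata WHERE package = ?", (package,)
    )
    return dict(rows.fetchall())
```

A real implementation would also need to think about concurrent writers, since many packaging tasks run in parallel.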

Parser notes

  • The parser shouldn't be responsible for the early set of TOPDIR. Either the metadata should set it in bitbake.conf, or the UI / convenience API which wraps the parsing should set it
  • The set and manipulation of 'FILE' could be done differently, perhaps via injection of additional Statement nodes

Performance improvement

  • Consider use of cython for components -- note, need to reduce the tight binding between bitbake modules
  • Consider reworking the current metadata interface to simply be a different API/representation of the parser AST
  • Document usage of meliae and heapy for memory profiling
  • Look closer at profiling -- Richard has mentioned he cannot seem to nail down the real underlying systematic performance problem
  • Try out-of-process logging to reduce the log write overhead
    • Note, we already sort of do this via the messages being sent to the UI, but the socket is shared for both logging and events. Determine if this is problematic, and attempt to determine how much time is spent waiting for log methods to return. Potentially consider adjusting the buffer size for the UI message queue
    • TODO: make tasks a no-op, then do a build with and without log records being pushed to the UI
      • Did this, the difference was pretty minimal, as a typical build doesn't send that many log messages to the UI, it seems. I wonder if the log file bits are an impact? Test those next.
  • Suspect worst memory usage offender during the parse is the depends_cache, as our cache is fully constructed in RAM before dumping due to parse time / performance
    • For oe-core, the RAM usage during the parse hits around 240 megs, then drops down to around 70 after the up front parse for the rest of the build
    • Looks like individual task processes have a typical unique set size of 34 megs
  • Task execution is still rather slow even when exiting the forked task immediately with os._exit() rather than exec_task, as Richard pointed out
    • On my VM, it took 2.5 minutes from scratch to exec of pseudo-native/do_populate_sysroot with no-op tasks, and parsing just oe-core
    • Each task appears to take 0.6 to 1 second even if it parses but does not execute an actual task
      • For one task's parsing on this machine: 406377 function calls (401802 primitive calls) in 0.552 seconds
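
Call counts like the ones quoted above are in the format printed by Python's cProfile/pstats. A small helper along these lines can collect the same kind of report for any callable; `profile_call` is a hypothetical name and nothing here is specific to bitbake.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report text).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

Wrapping a single task's parse in a helper like this makes it easier to compare runs before and after a change.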

Potential longer term projects

  • Reorg and refactor recipes and classes to facilitate conversion between metadata file formats
    • Shift away from reliance upon certain imperative behaviors
      • Shift class inherit lines to the beginning of recipes
      • Drop control of bitbake behavior from recipes, in favor of shifting those to other places, and if necessary adding declarative hooks (e.g. a shell function which is called after unpack/patch but before configure/compile, to prepare the source tree)
  • Break up base.bbclass and bitbake.conf into a sane structure
  • Drop make handling from base.bbclass in favor of an explicit bbclass
  • Avoid unnecessary .inc usage, so the user doesn't have to chase the metadata around
  • Replace our toolchain bits in favor of offloading to crosstool-ng
  • Pare down the core, move some of the extraneous bbclasses elsewhere
  • Ideally, though it could be extraordinarily painful: drop the full sysroot ${prefix} for native/cross in favor of fixing things to actually be relocatable
  • Build in support for non-cross operation, if possible. It would of course need something along the lines of external-sourcery-toolchain to extract the libc from the host sysroot

Random OpenEmbedded thoughts https://gist.github.com/kergoth/9983394

Dislikes

  • BitBake is too monolithic. It's a beast. Not very in line with UNIX philosophy.
  • Metadata is scattered around, difficult to map what bitbake is doing to what came from where
  • Recipes are both declarative and imperative. Order matters, yet it doesn't. Some things are immediate, some are lazy. This leads to confusion.
  • BitBake behavioral logic is interspersed with real knowledge of the upstream package. E.g. flags in recipes
  • BitBake is still rather slow
  • The system is unable to do MACHINE=native builds very well. It expects cross-compilation
  • Massive overhead even to build something small. The number of tasks needed to build busybox is ridiculous. Admittedly, there are valid reasons for this, but we'd be better off shipping a known sane host environment rather than rebuilding the universe to satisfy our assumptions
  • Fetching is a nightmare, particularly SCM handling. The fact that it's impossible to build with BB_NO_NETWORK in some cases is terribly sad
  • BitBake's internals are a twisty maze with no exit. Modules are badly intertwined, causing maintenance headaches
  • Steep learning curve in general, high complexity

Likes

  • Layered metadata facilitates independent maintainers, as well as making it easy to break up changes in a logical way, and avoid having to modify upstream when not necessary.
  • Easy to collaborate, due to our built in support for our orthogonal axes: distro, machine, image, and conceptual layers to handle variable specificity (global, distro, arch, machine, ..)
  • "Class" mechanism lets us abstract out common functionality from recipes to pare them down to the bare essential description of the upstream project (ideally)

WouldLike

  • More in line with UNIX philosophy

    • Split out a standalone fetch tool, ideally using documented, well designed URL schemes
    • Split out a tool which produces an image from a package feed (yocto has started in this direction already)
    • Possibly split out a tool which takes a recipe and builds it, or takes a reformatted recipe (e.g. in json) on stdin and builds it, with the result that binary packages are produced. We could consider use of parallel, ninja, tup, or redo to do the actual execution of this against all recipes, obeying dependencies
  • Purely declarative recipes

  • Some form of plugin mechanism to control what builder implementation is used, for the non-declarative components, with the ability for a recipe or layer to express what builder is required

  • Improved method for making local changes to source trees, without the overhead imposed by the fetch/unpack/patch mechanism

  • Improved filesystem structure, such that non-temporary files (output) don't go in tmp, to reduce confusion

  • Drop the DEPENDS vs RDEPENDS dichotomy, which is also a source of confusion

  • Some form of isolated sysroots to reduce potential cross-recipe contamination

  • In addition to handling input checksums, we should also respond to changes in the output of an operation, so a change to the input which doesn't affect the emitted binary packages won't result in rebuilding the dependencies of said package

  • Support for automatic interaction with a backing store for build results

  • Support for capturing useful information about a build and use it for subsequent builds. Not build output, but information which could be used for e.g. more intelligent scheduling

  • Some form of configuration UI, ala menuconfig. I don't believe this should be optional or postponed as it is today, as it's a clear advantage (just look at every comparison between buildroot and OE/Yocto)

  • Out of the box support for native, on target builds of the entire embedded linux system

  • Improved discoverability and inspection mechanisms. It should be easy to ask the system things like:

    • What configuration variables are available?
    • What recipes are available?
    • What recipes depend upon FOO?
    • What recipes does FOO depend upon?
    • What are the configuration variable semantics?
    • What events exist, and when are they fired?
    • What was built? (toaster is handling some of this now, for post-build examination)
  • Lower priority, but would be nice

    • Job dispatch across multiple machines
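
The "respond to changes in the output" item above could look roughly like the following. This is a minimal sketch, not how BitBake does anything: `output_hash` and `dependents_need_rebuild` are hypothetical names, and the emitted package contents are modelled as a plain dict of path to bytes.

```python
import hashlib


def output_hash(files):
    """Hash a {path: bytes} mapping in a stable (sorted) order."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[path])
        digest.update(b"\0")
    return digest.hexdigest()


def dependents_need_rebuild(previous_output_hash, new_files):
    """Rebuild dependents only when the emitted output really changed."""
    return previous_output_hash != output_hash(new_files)


# Hypothetical example: the input changed (say, a comment in the recipe),
# the task was re-run, but the emitted files are byte-for-byte identical.
old = output_hash({"usr/bin/foo": b"\x7fELF..."})
print(dependents_need_rebuild(old, {"usr/bin/foo": b"\x7fELF..."}))
```

The input checksum still decides whether the task itself re-runs; the output hash only decides whether that re-run propagates to whatever depends on it.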

OpenEmbedded Usability Concerns

  • Usability Concerns
    • Inconsistency
      • Configuration variable semantics are not consistent.
    • Discoverability
      • What recipes are available?
      • What configuration variables are available?
      • What recipes depend upon FOO?
      • What recipes does FOO depend upon?
      • What are the configuration variable semantics?
      • What events are fired, when?
    • Orthogonality
      • .conf vs .inc
      • include/require .inc vs INHERIT bbclass
      • Python code manipulating the metadata (or manipulating files). Perhaps we should kill anonymous python functions in favor of an event handler responding to ConfigParsed or RecipeParsed or similar.
        • Anonymous python function
        • Def'd python function, called from an inline python snippet
        • Event handler
      • Python tasks and def'd python functions. Python tasks could be replaced by def'd python functions with a specific signature, then we'd have only one syntax for python functions (though it would be different than that of shell functions). Alternatively, allow specifying a function signature on the non-def python functions, and make those callable from other bits of python in the metadata.
      • Functions versus tasks. A shell function is a task with no dependencies. There's no reason to make tasks separate from functions.
    • Misc
      • Event handlers require this common wrapper crap. Abstract that away by specifying the events in a variable flag. NOTE: If there was an events flag, "addhandler" could go away in favor of only calling python functions with the flag.
      • Anything very complex at all requires that you know python, so why not just let the user supply python modules to manipulate things, rather than doing so much in the metadata?
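
Both discoverability questions about dependencies come down to walking a dependency map forwards or backwards. A rough sketch, assuming the dependencies are already available as a plain dict; the recipe names and the `DEPENDS` table here are made up for illustration, and none of this is an existing BitBake interface.

```python
# Hypothetical recipe -> build-time dependencies table.
DEPENDS = {
    "zlib": ["libc"],
    "openssl": ["libc", "zlib"],
    "curl": ["openssl", "zlib"],
    "busybox": ["libc"],
}


def what_does_it_depend_on(depends, recipe):
    """Forward query: direct dependencies of a recipe."""
    return sorted(depends.get(recipe, []))


def what_depends_on(depends, target):
    """Reverse query: recipes which list target as a dependency."""
    return sorted(name for name, deps in depends.items() if target in deps)


print(what_does_it_depend_on(DEPENDS, "curl"))
print(what_depends_on(DEPENDS, "zlib"))
```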

OpenEmbedded Usability Tasks

  • [-] Usability
    • Python function/task definition: two syntaxes where there could be one
      • Allow specifying function signature in python function/task syntax, and make those functions available to the other in-metadata python code.
      • Kill the 'def' python syntax
    • [-] Fix up event handlers. Using event handlers is unnecessarily complex, requiring setup code and an addhandler call.
      • [+] Add a "RecipeParsed" event
      • [+] Add an 'events' flag, which is used to control which events will be handled by this function.
      • Kill 'addhandler' entirely in favor of calling the functions with the flag.
      • Kill anonymous python functions in favor of python functions, with names, which include the "RecipeParsed" or "ConfigParsed" events in their events flag. There are too many ways for python code to manipulate metadata currently. This way, it's all via events.
    • Kill the separation between "functions" and "tasks". Uncertain, needs more thought.
      • Kill the task flag, every function is a task.
      • Remove the implicit 'do_' for the -c argument, let the user run any function they want to.
      • Rename 'addtask' to something appropriate, which indicates what it really does, adjusting dependencies amongst functions/tasks.
    • Move listtasks functionality into bitbake
    • Make listtasks functionality show 'doc' flag info (or python docstring info?) for tasks, so the user knows what they do. Perhaps hide any functions/tasks which don't define a doc, by default.
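
The 'events' flag idea can be sketched in a few lines. This is only an illustration of the shape: `EventRegistry`, `add_function` and `fire` are hypothetical names, the event classes are empty stand-ins, and the datastore is a plain dict rather than a real bitbake datastore.

```python
# Empty stand-ins for the events named above.
class ConfigParsed:
    pass


class RecipeParsed:
    pass


class EventRegistry:
    """Calls only the functions whose 'events' flag names the fired event."""

    def __init__(self):
        self._handlers = []  # list of (function, frozenset of event names)

    def add_function(self, func, events_flag=""):
        # A function with an empty flag is never treated as a handler.
        events = frozenset(events_flag.split())
        if events:
            self._handlers.append((func, events))

    def fire(self, event, d):
        name = type(event).__name__
        for func, events in self._handlers:
            if name in events:
                func(event, d)


def set_default_license(event, d):
    # What an anonymous python function might have done at parse time.
    d.setdefault("LICENSE", "unknown")


registry = EventRegistry()
registry.add_function(set_default_license, "RecipeParsed")

d = {}
registry.fire(ConfigParsed(), d)   # not listed in the flag, so not called
registry.fire(RecipeParsed(), d)   # listed, so the function runs
print(d)
```

With something like this there is no wrapper code in the handler and no separate registration step; the flag is the registration.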

Ideas from other projects

  • Features from other projects we may want to add to OE/MVL6
    • Firmware Linux
      • Sanitized host bin directory w/ forced PATH. This would make our builds slightly more deterministic.
      • Qemu launch script. Ideally, this would both spawn a distccd and configure the target fs to use it, and spawn a minimal http server to make available an ipk feed. I think runqemu can do this now.
      • Do we want to provide a class which spawns a firmware Linux qemu and builds in it and captures the output, rather than crosscompiling?
      • Add & support using oneit
    • e2factory
      • Automatic interaction with a backing store for build results. This is a more implicit, automatic method of handling prebuilt binaries / pstage packages, since it automatically deploys the results to a shared area if desired, for other engineers to leverage in their own builds.
      • Hash of build input used to access the build results. e2factory has less to worry about, so theirs is less brittle than the MVL6 cfgsig. Needs further investigation.
      • Lua
      • Project-based
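
The sanitized host bin directory idea is simple enough to sketch: symlink only the approved host tools into one directory and make that directory the entire PATH. The function names and the tool list are assumptions for illustration, not Firmware Linux's or OE's implementation.

```python
import os
import shutil


def make_sanitized_bindir(tools, bindir):
    """Symlink each named host tool into bindir; return the ones not found."""
    os.makedirs(bindir, exist_ok=True)
    missing = []
    for tool in tools:
        path = shutil.which(tool)
        if path is None:
            missing.append(tool)
            continue
        link = os.path.join(bindir, tool)
        if not os.path.lexists(link):
            os.symlink(path, link)
    return missing


def sanitized_env(bindir, base_env=None):
    """Copy the environment, forcing PATH to the sanitized directory only."""
    env = dict(os.environ if base_env is None else base_env)
    env["PATH"] = bindir
    return env
```

A build step would then run with something like `subprocess.run(cmd, env=sanitized_env(bindir))`, so a configure script can only find the host tools that were explicitly allowed.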

OpenEmbedded Possibilities

  • Filesystem / Image Creator
    • By default, operate against the most recent build in the current project, but allow the user to specify any project, including those produced for builds by the builder daemon(s)
    • Allow the user to quickly and easily
      • Add individual files from a recipe into the fs/image
      • Add binary packages from a recipe into the fs/image
      • Setup execution of user scripts at fs creation time and/or boot time
      • Allow manually executing a set of shell commands, and record those commands for later replay
  • Project Setup
  • Recipe Executor
  • Builder Daemon
    • SCM Monitoring for continuous integration
    • Support distributed operations
    • Ability to queue up local only builds
    • Projects / Tasks
      • Description
      • Dependency Handling
  • User interface(s) for the build daemon

Old OpenEmbedded Tasks

  • [-] BitBake
    • Change the anonymous function concatenation to prevent short circuiting. Currently, a single anonymous python function doing a 'return' can completely hose you. Possibilities: Move each anonymous python function into a function during the concat, and append the execution of those functions to the final concatenated string, or use the python ast module to scan and prevent 'return'.
    • Create a python module to act as a wrapper around subprocess/popen/system.
    • Kill the PID specific run/log script emission, just emit the current, overwriting the old. I can't think of a real use case for the other behavior. Making sense of the pid specific logs involved manipulating the output of ls.
    • Audit BitBake exception handling.
      • A ^C shouldn't dump the log when BBINCLUDELOGS is enabled, only other reasons for the exit should result in that behavior.
      • Remove all sys.exit / raise SystemExit instances from the python package.
    • Change the behavior of -v and -D
      • I think -v should be the one that adjusts the level, and -D should just allow display of debug messages, so -vvv -D instead of -DDD. Right now the behavior of -v isn't exactly obvious or intuitive to people.
      • Then you could do -vvv -l Parsing -l TaskData instead of -l Parsing -l Parsing -l Parsing -l TaskData -l TaskData -l TaskData ;)
    • Add a commandline argument which runs the specified task against all dependent recipes. Would make do_buildall, do_fetchall, etc obsolete.
    • Add an argument to specify which variables you want -e to emit.
    • Add better commands/arguments to list what will be built and why
    • Add whatprovides and whatrequires commands to the shell.
    • Create a proof of concept alternative to the 'bitbake' command, which provides a bitbake shell like interface, via the commandline.
    • Fetcher
      • bb.fetch has no real arbitration method. If multiple classes are registered which can handle the same urls, there should be a way for the user to enforce their preference
      • DL_DIR needn't be assumed in bb.fetch, it'd be nicer to pass the destination along, so other tasks could easily pull things down to alternate locations
      • Ensure that all fetchers use runfetchcmd.
      • Leverage the Python urlparse module.
      • Try a curl based fetcher
      • Try a urllib2 based fetcher. Tried a preliminary version, it breaks on Darwin, due to it executing a function which isn't safe to run after fork but before exec, which is how our tasks are executed.
    • Implement "-=" for word separated variables. Update: this was done as _remove.
    • Kill the forced re-export of the whitelisted environment variables. Explicit is better than implicit, particularly for variables which could break builds. The current behavior can actually break things. If you whitelist TOPDIR, now TOPDIR will be exported, even though it wouldn't have been had you set it in local.conf, or didn't override it at all.
    • [+] Make the cooker realize that if a task's value is entirely empty, or only whitespace, that it simply doesn't need to run it. Treat that circumstance as the same as the task stamp existing. Phil thinks this is, while not harmful, not particularly useful either. It's a good point, give this more thought.
    • [+] Messaging / Logging
      • [+] Add the 'default' messaging domain to the domain enum, so that it can be specified with -l.
      • [+] Deprecate bb.msg, convert all of bitbake to use logging directly.
      • [+] Leverage the Python logging module.
    • Parser
      • Add a method which uses the dispatcher to obtain the AST for the file in question, rather than current handle() which returns a datastore directly.
      • Clean up the dispatch.
      • Create a Semantic Model, based on the AST, which provides a nicer interface for operation on the datastore. cooker/runqueue shouldn't need to poke at the internals of how our datastore works. It wants to get the values of Variables to control its operation, and execute Tasks.
      • Once we have the AST dispatch, we can change the IncludeNode/InheritNode to do the inclusion/inherit when the file doing the including/inheriting is parsed, so the StatementGroup won't change its content at eval time.
      • Switch to object oriented, since it's pretty much ghetto OO right now anyway.
    • Run successfully under IronPython. Would be cool if we could use Pex to generate unit tests and stuff :)
    • There may be brokenness in the BBPATH mangling for relative include/require.
  • [-] Classes
    • collections.inc
      • Change the bb.fatal calls into raising of new exceptions with real meaning.
    • amend.inc
      • Make it load all amend.inc instances in the filespath, rather than only the first.
      • Perhaps add a toggle for this behavior, uncertain.
    • autotools.bbclass
      • If a configure script defines AM_MAINTAINER_MODE, we should pass --disable-maintainer-mode, to disable the makefile rules that cause autoconf, etc to be rerun. Our run of those tools is explicit, not implicit.
    • base.bbclass
      • Break up into multiple classes with specific purposes.
      • Kill all .la files put into place by do_stage or do_install. Per Richard Purdie, the Darwin linker sucks, so we can't do this, at least not for staging. Removing them from the target distribution would still likely be a Good Thing, however.
      • Make checksums.ini emission in tmp have an option to emit directly into the checksums.ini for that collection.
    • gettext.bbclass
      • Kill duplicate append to DEPENDS.
    • image.bbclass
      • Add a convenience function for ROOTFS_POSTPROCESS_COMMAND to run 'mklibs' against the root filesystem. May want to do that automatically for ONLINE_PACKAGE_MANAGEMENT = "none".
      • Make the -dbg image creation mechanism into a generic means of creating sets of images from a single image recipe. Global config variable with a list of the types of images to be emitted, then a callback mechanism to register new image variants which manipulate the metadata for the execution of do_rootfs by each.
    • kernel.bbclass
      • Add a global toggle for the build/packaging of kernel modules. Yes, you could manipulate the defconfig to do that, but a toggle would be a faster way if you just don't care about the non-builtin bits for your platform, and want to speed up the build process.
    • [-] package.bbclass
      • [+] Move creation of -dbg packages into a python snippet + PACKAGES_DYNAMIC rather than PACKAGES, to enable automatic creation of -dbg packages for subpackages, not just the main one. Apparently this was shot down in the past, but I think the discussion should be re-opened.
      • [+] shlibsdeps & friends could check for missing DEPENDS/RDEPENDS based on the rdeps they automatically capture.
      • Write a task to expand all the FILES globs against ${D} and note all overlap which is occurring between packages.
    • packaged-staging.bbclass
      • Fix arch handling for pstage packages. Cross packages are dependent upon both build & target arch/os. What about 'any' arch for recipes that only install scripts?
      • package_stagefile_shell is broken when DEPLOY_DIR is outside TMPDIR, due to assumptions it makes about locations
      • Packaged staging chokes if you give it a packaged staging package, no tmp, and do a -c setscene instead of building BB_DEFAULT_TASK.
      • packaged-staging assumes that 'staging' and 'stamps' are sibling directories, and that both are in TMPDIR.
    • [-] patch.bbclass
      • [+] Add the ability to assume patch for .patch/.diff extensions, and ensure patch= has a way to disable patching entirely.
      • [+] Combine patch= and pnum=.
      • [+] Create a variable to hold the list of local patches to be applied, which by default is generated from SRC_URI. Patch.bbclass would then not touch SRC_URI directly at all.
      • Fix the output of the 'patch' PATCHTOOL. Right now it is overly verbose, and printing a python structure rather than the patch name.
      • Profile the patch methods, determine if we should scrap quilt as the default.
    • recipe_sanity.bbclass
      • We can probably remove the current logic in favor of leveraging d.dict as the keys in the recipe, and d._data.keys() as the configuration metadata. This would assume that we're using datasmart objects, of course.
  • [-] Config Files
    • [-] bitbake.conf
      • Add an override which is less specific even than pn-${PN}. Perhaps 'global', or 'general' or 'bitbake' or 'oe' or something.
      • [+] Make the "use ccache if it's installed" behavior opt-in, not opt-out. Pending push to upstream.
      • OE should look at removing the -L option from LDFLAGS and the -isystem from CFLAGS
  • Documentation
    • BitBake User Manual
      • Missing bits on ?=
      • Missing bits on EXPORT_FUNCTIONS and its motivation
      • Switch to docutils/sphinx, using reStructuredText, to make it easier to maintain
    • bitdoc
      • Add the ability to document classes and configuration files via comment headers, ala pydoc. Right now it's hard for a new user to know what classes are available, for optional bits of functionality (the ones we pull in via INHERIT rather than a recipe pulling them in to get a functional build).
      • Improve the html output
  • [-] General
    • Courtesy RP: Play with SRCREV, 'DISTRO="moblin-bleeding" in poky is a lovely toy for it'
    • Audit our bbclasses
      • Event handlers
        • Kill unnecessary 'return NotHandled'
        • Use isinstance() rather than getName().
      • Others may be able to be removed entirely
      • Some need cleaning up, updating to current standards
      • Split up into INHERIT based functionality and build classes used by recipes
    • Fix builds under Darwin/OSX.
    • Implement USE-like functionality with a different name.
    • Make it easier / cleaner to leverage B vs S
      • add an autotools_builddir to do the ${B} in the task dirs flags, remove that from base.bbclass
      • enable it in kernel.bbclass with EXTRA_OEMAKE = "'O=${B}'" or somesuch..
      • set B in bitbake.conf to a sane value != S
    • [-] Python Usage
      • [-] Best Practices
        • Decorators. Can't use until BitBake/OE switch to python 2.5+
        • [-] Kill "for foo in somedict.keys():" in favor of "for foo in somedict:" unless somedict is going to change during the iteration.
          • [+] BitBake
          • MVL6
          • OpenEmbedded
        • Usage of setdefault and/or defaultdict? Unsure of the best practices here, give further thought.
        • [-] Usage of sorted() vs a .sort() on a temporary list
          • [+] BitBake
          • MVL6
          • OpenEmbedded
        • Use generators and iterators where applicable.
      • Language & Library Capabilities
        • Decorators may be useful here and there - requires python 2.5 :(
        • Leverage shutil. I know there are places where we fork off a 'cp' with system() rather than using the quite functional shutil.copy2().
          • foundation alone has 2 that could be changed to use shutil.
    • Revamp {pre,post}{inst,rm} handling
      • Areas to capture directly in metadata
        • Alternatives
        • New user(s) / group(s)
        • Permissions / Ownership
        • Service handling
    • Revamp inetd/xinetd configuration handling
    • Revamp startup script / service handling
    • [-] Rework staging
      • Aspects of the capturing process:
        • We need a shell script or diff which mangles the captured output.
        • We need to also capture and use runtime dependency information.
      • Can probably scrap the whole layout_* split and all.
      • Goals
        • Don't require rebuilding a project in order to change its packaging?
        • Make builds more deterministic by ensuring that recipes can only get to the bits they say they depend upon, and nothing else.
        • Prebuilt support from the get-go.
      • [-] Multiple phase approach?
        • Add task for population of private staging areas, kill populate_staging, or move it.
        • Don't include stamps in the pstage packages, instead leveraging 'check' flags on the tasks to avoid their execution when the pstage package exists already. If we do this, we can kill the setscene stuff.
        • Implement packaging based on the staging package contents, rather than calling do_install itself directly, and remove the binary packages from the pstage packages.
        • Kill do_stage, instead populating the staging package using do_install, mangling with a postinst on the staging package.
        • [+] Make base.bbclass pull in staging.bbclass
        • Make packaged staging required rather than optional and clean things up
        • [+] Move the non-packaged aspects of staging into staging.bbclass
        • Populate the RDEPENDS of the pstage packages as appropriate, using the same rdepends generation mechanisms that the packaging process uses in order to establish runtime dependency between pstage packages.
        • [+] Rename packaged-staging.bbclass to staging.bbclass
        • Rename the staging packages, as they are now capturing the full output of the build, not exclusively staging bits.
      • [-] Revamp all of staging population & packaging
        • [+] Add a new staging task before configure which gathers up the do_install output from the recipes it depends on. Might need to enhance BitBake to provide extra information, or emit extra information into the project area.
        • [+] Add the mangling bits as callbacks, in a .conf/.inc, so it's global, and will be called by the new staging task.
        • [+] Capture the output of a recipe's do_install.
        • [+] Remove all staging bits, archiving the staging mangling bits, as those will be needed.
        • Ugh, forgot about rdepends, should I scrap it and just use ipk instead of the current archive format, or come up with something of our own?
  • Recipes
    • keyutils
      • Appears to have a PARALLEL_MAKE failure.
    • libpcap
      • Add an autoconf enable argument to toggle bluetooth sniffing
      • Add an autoconf enable argument to toggle usb sniffing
        • Kill the udevinfo call for cross
  • [-] Research
    • Can we LD_PRELOAD sandbox our builds to prevent certain accesses / decisions by autoconf based on aspects of the build machine?
    • Can we use autoconf tracing to get a list of with/enable configure options, and compare against EXTRA_OECONF?
    • Configuration Signature Generation
      • Fix it so the signatures only care about the final result, not how it got there

        • Specifically:

        FOO = "bar"
        BAR = "${FOO}"

        should probably have the exact same signature for the BAR variable as:

        BAR = "bar"

      • Create a bb.data wrapper that tracks one of the items below. This wrapper would facilitate more accurate signatures by seeing exactly what variables flow into the tasks. We could generate a signature based on the tasks and exported vars, and their references, and that would be nearly sufficient.

        • Variable accesses by tracing getVar calls, the way the cache does, apparently
        • Variable references by parsing the vars
      • Remove signature from pstage package filename, instead storing the signature as a pickled python set inside of the pstage files. This could result in unnecessary downloading of prebuilts, but could also allow us to notify the user / log changes to the metadata resulting in signature changes, and it would also allow us to ensure that adding a variable to the blacklist would result in ignoring that variable in both the old and the new signature, not just the new as it stands today.

    • Create a tool to extract all the python code from the metadata, to run pyflakes/pylint/clonedigger/etc against it.
    • Determine how & where OE_LT_RPATH_ALLOW is used.
    • Determine how to capture changes to a directory before and after an operation, without using an LD_PRELOAD library to monitor.
      • Git? (note: the dir we're monitoring may already be a git repo)
        • alternatively, a separate GIT_DIR
        • git-stash has a snippet that may be useful
    • Determine if an sqlite database would be more appropriate for the emitted metadata used in the abiversions class.
    • Determine if it's possible to change bb.fetch to return the local path in the fetch operation rather than via localpath methods. Would likely need to leverage some sort of persistent datastore to capture the local paths for each recipe.
    • Determine if we want to support truly native builds, across everything, leveraging a global BBCLASSEXTEND.
    • Do we want include_next/require_next? Phil argues against it.
    • File Format / Parsing
      • Can we change the way we capture the lines into an ast / semantic model to facilitate moving the distro/machine .conf metadata before the local.conf metadata?
      • Find incremental steps to clear up the declarative vs imperative confusion. I'd like to try to capture which recipes -rely- on the fact that inherit occurs immediately, and attempt to implement deferred inclusion, placing the classes in a layer between the configuration metadata and the recipe metadata.
        • Add task to diff the current recipe metadata against the stored version.
        • Add task to store a recipe's metadata.
        • Modify BitBake to make inherit directives lazy.
      • Look into ways to avoid include/require in recipes, find a better mechanism of sharing the common metadata amongst different versions of the upstream project.
      • Should we split the descriptive metadata from the executable metadata? I never wanted to, but it's something we may want to revisit.
      • What about means of interacting with other sources of metadata, like freshmeat.net?
      • What about ways of extracting/storing the metadata other than flat file?
      • What would it take to implement pure python bbclasses, perhaps distributed as eggs?
    • Find out if the user base would prefer a flexible image creation tool rather than using image recipes.
    • How capable is Poky's functionality for creating full disk images with partition tables?
    • How much work would it be to build a standard chroot/VM/livecd/etc to do builds in, which everyone could use?
    • [+] Investigate the C implementation of update-alternatives included in chkconfig, as an alternative to cworth's shell incarnation, or the busybox one, or the debian perl one.
    • [+] Is it viable to implement type checking for configuration variables? Create a proof of concept.
    • Is it viable to only regenerate autotools bits if the dependent macros, etc change?
    • Look into -Wl,-as-needed usage. Richard says Poky does this.
    • Look into alternative mechanisms for controlling/configuring BitBake. A more project based approach as opposed to the BBPATH/BBFILES mechanisms, for example.
    • Look into making some more projects support enforcing symbol visibility
    • Look into making some more projects use binreloc
    • Look into more general usage of $ORIGIN
    • Look into something like the Ksplice Uptrack Manager for embedded systems
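
The "signatures only care about the final result" item in the list above can be shown with a toy expander. A sketch under assumptions: the metadata is a plain dict of strings, only `${VAR}` references are handled (no overrides, no inline python), and `expand` and `value_signature` are hypothetical helpers, not bb.data functions.

```python
import hashlib
import re

_REF = re.compile(r"\$\{([A-Za-z0-9_]+)\}")


def expand(name, metadata, _seen=()):
    """Recursively expand ${VAR} references in metadata[name]."""
    if name in _seen:
        raise ValueError("circular reference involving %s" % name)

    def replace(match):
        ref = match.group(1)
        if ref not in metadata:
            return match.group(0)  # leave unknown references untouched
        return expand(ref, metadata, _seen + (name,))

    return _REF.sub(replace, metadata[name])


def value_signature(name, metadata):
    """Signature of the final expanded value, not of how it was written."""
    return hashlib.sha256(expand(name, metadata).encode("utf-8")).hexdigest()


indirect = {"FOO": "bar", "BAR": "${FOO}"}
direct = {"BAR": "bar"}
print(value_signature("BAR", indirect) == value_signature("BAR", direct))
```

Hashing the expanded value means the two spellings of BAR from the example come out identical, which is the property being asked for.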

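For the anonymous function short circuiting problem in the BitBake tasks above, the first possibility (one function per snippet, then call them all) might look like this. A rough sketch only: `build_anonymous_runner` and `run_anonymous` are hypothetical names, and a dict stands in for the datastore `d`.

```python
import textwrap


def build_anonymous_runner(snippets):
    """Wrap each snippet in its own function, then call them in order."""
    definitions = []
    calls = []
    for index, body in enumerate(snippets):
        name = "__anon_%d" % index
        body = textwrap.dedent(body).strip("\n")
        if not body.strip():
            body = "pass"
        definitions.append("def %s(d):\n%s\n" % (name, textwrap.indent(body, "    ")))
        calls.append("%s(d)" % name)
    return "\n".join(definitions + calls) + "\n"


def run_anonymous(snippets, d):
    """Execute all snippets against d; a 'return' only ends its own snippet."""
    code = compile(build_anonymous_runner(snippets), "<anonymous>", "exec")
    exec(code, {"d": d})
    return d


snippets = [
    "d['first'] = True\nreturn\nd['unreachable'] = True",
    "d['second'] = True",
]
print(run_anonymous(snippets, {}))
```

With plain concatenation the `return` in the first snippet would have ended everything; here the second snippet still runs.
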
OpenEmbedded

  • Tips
    • Non-exported variables that aren't referenced by other vars will never be expanded.
    • Anonymous python functions have no guarantees on how often they'll be run. Currently they're run both at initial parse time and every time a full parse is done, which is at every task execution.
    • If collection Alpha has foo_1.0.bb and collection Beta has foo_2.0.bb, and Alpha is higher priority than Beta, it will use foo_1.0.bb, even though foo_2.0.bb is newer.
    • Make a foo.bb and a bar.bb. Make bar.bb PROVIDES += "foo". There's now no way to avoid its preference to build foo for foo, as it doesn't obey DEFAULT_PREFERENCE in either direction, it'll always build foo.bb unless you specify a PREFERRED_PROVIDER.
  • Market Analysis
    • User Base
      • Comments from Thomas Petazzoni's presentation
        • "Like a desktop distribution", not easy to simplify things, stuck with the sysvinit based startup, stuck with the postinsts from all the packages being run, rather than manually doing so (this one is incorrect, a combination of ONLINE_PACKAGE_MANAGEMENT="none" and ROOTFS_POSTPROCESS_COMMAND could get you there)
      • Cons according to Thomas Petazzoni of Free Electrons
        • No stable releases
        • Steep learning curve
        • Very slow to run
        • Too generic, huge boot times
        • Packages mandatory. This is BS, though. We can construct root filesystems using packages, but not actually put any packaging bits into the image, courtesy ONLINE_PACKAGE_MANAGEMENT.
    • Competition
      • Embedded Build Tools To Test Out
        • repo & android's buildsystem
          • How dependent upon Gerrit is repo?
            • Can it run without it?
            • Would ReviewBoard integration be possible?
        • Crosstool-ng
        • LTIB
        • PTXDist
        • e2factory
        • Firmware Linux. Has a qemu based approach, similar to what scratchbox does, only it runs under system, not application mode.
          • This approach gives us the reliability, since configure scripts can't pick up aspects of the host machine.
          • Fully bootstrapped. We could get around this by using OpenEmbedded to build a qemu, vmware, or live cd to do OE development in, perhaps.
  • Issues
    • Darwin Issues
      • mktemp usage
      • m4-native issues
      • curl fetcher
      • sed-native deps for packages that need gnu's sed
    • [worked around] do_rebuild is broken in recent bitbake versions. 62a9e5a33343ba4e6393e06bcc6c711f86418ccb is the cleanup commit that breaks do_rebuild. It removes the graph code from build.py, in a cleanup, avoiding duplication, but resulting in bb.build.exec_task no longer obeying dependencies, so the do_rebuild task can't use exec_task to run clean, then BB_DEFAULT_TASK.
  • Design Ideas
    • Make it a general purpose enterprise level build tool
    • Use .NET. Makes it easy to break things up into assemblies, and write components in whatever language the user prefers
    • Don't write classes in the recipe metadata format. Probably make them into .NET assemblies. They have a specific defined API, aren't in the same format as the recipes themselves, and can be written in any .NET language the user would prefer to use
    • There must be versioning of recipe classes
    • Should use some sort of versioning of the file format, to handle incremental improvements to it?
  • Desired Features
    • Continuous Integration
    • Scheduled Builds
    • On Demand Builds
    • No Fetching Necessary
    • Metadata in the source tree
    • Multiple SCM support
    • No manual recipe editing required
    • Web interface
    • Console interface(s)
    • GUI interface(s)
    • Deployment tools/capability (packaging, image creation, etc)
    • Recipe classes
  • Requirements
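
The collection priority tip above (Alpha's foo_1.0.bb winning over Beta's foo_2.0.bb) is just an ordering question. This sketch only models the behavior the tip describes, with made-up priorities and a hypothetical `pick_recipe` helper; it is not bitbake's provider selection code.

```python
def pick_recipe(candidates):
    """candidates: list of (collection_priority, version_tuple, filename).

    Collection priority is compared first; the version only breaks ties
    between candidates of equal priority.
    """
    return max(candidates, key=lambda c: (c[0], c[1]))[2]


candidates = [
    (10, (1, 0), "alpha/foo_1.0.bb"),  # higher priority, older version
    (5, (2, 0), "beta/foo_2.0.bb"),    # lower priority, newer version
]
print(pick_recipe(candidates))
```

Swapping the key to `(c[1], c[0])` would give "newest version wins, priority breaks ties", which is what many people expect on first contact.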

OE Issues IRC Log

[10:57am] pb_: kergoth: I think the main problem with understanding the oe repository is the weird mixture of apparently static, declarative things and dynamic, mutable variables, coupled with the fact that the processing order for various things is a little bit unpredictable without knowing all the details of how bitbake operates.
[10:57am] Sup3rkiddo left the chat room. ("Leaving")
[10:57am] pb_: kergoth: for example, the long-standing thing where "FILES_${PN}" and "FILES_foo" are not exactly synonyms for package foo, although it appears at first glance like they ought to be.
[10:57am] sweetlilmre: zecke: I tried a -f -c build, still nothing
[10:58am] sweetlilmre: unless you meant -f -c compile?
[10:58am] zecke: sweetlilmre: -f -c compile
[10:58am] sweetlilmre:
[10:58am] pb_: kergoth: in fact, I guess the way that overrides operate are at the root of a lot of the misunderstandings.
[10:58am] hubar: Is there anyway to use the packages I already downloaded inside my source directory instead of redownloading them?
[10:59am] woglinde: hubar????
[10:59am] sweetlilmre: hubar: DL_DIR?
[11:00am] hubar: hmm?
[11:00am] hubar: let me try DL_DIR
[11:00am] woglinde: you have one in your local.conf
[11:00am] woglinde: and thats where all downloaded files are put
[11:00am] sweetlilmre: hubar: the only reason I can think of that it is redownloading them is that OE conf is looking for a different dir, which is specified by DL_DIR
[11:00am] florian: bbiab
[11:01am] florian left the chat room. (Remote closed the connection)
[11:01am] kergoth: pb_: that's a good point. overrides was always a bit odd, both syntactically and when it affects things. prepend/append's evaluation time was a part of all that mess too. it's pretty obvious i started going from a gmake like syntax, not exactly one thing or another
[11:01am] sweetlilmre: hubar: default to TMPDIR/download afaik
[11:01am] kergoth: ah well
[11:01am] hubar: so you mean OE won't try to download new files if they are already downloaded?
[11:01am]•kergoth jots down pb's comments for future reference in case he comes across the necessary motivation to do something about it [11:01am]sweetlilmre:hubar: yep, if hashes match etc. [11:02am]sweetlilmre:hubar: warning I am a n00b here..  [11:02am]hubar:alright!  [11:02am]hubar:I am a bigger n00b then you  [11:02am]pb:kergoth: right, the prepend/append thing is also a fertile source of confusion, though luckily oe doesn't use it much anymore. [11:02am]•kergoth nods [11:02am]sweetlilmre:hubar: which install guide did you follow? [11:03am]woglinde:pb hm what to use instead? [11:03am]pb:if I was designing a putative "oe 2.0" kind of format then I think I would have conditional declarations be directly supported in the metadata [11:03am]hubar:sweetlilmre: wiki. [11:03am]thesing:pb: I thing in some cases one has to use append because = doesn't work. [11:03am]sweetlilmre:hubar: err, yeah, but which page, and what is your target system? [11:04am]kergoth:pb: i think it was a mistake to let 'inherit' be order dependent from the start.  it acted too much like include [11:04am]pb_:thesing: right, but that is just another symptom of the underlying problem. [11:04am] lrg_ left the chat room. (Read error: 110 (Connection timed out)) [11:04am]pb_:kergoth: yah, exactly [11:04am]kergoth:to point out a specific example of that sort of format confusion.. are we declarative or not [11:05am]pb_:quite. [11:06am]hubar:sweetlilmre: http://wiki.openembedded.net/index.php/Getting_Started [11:06am]hubar:sweetlilmre: targetting system is x86. [11:06am]thesing:right. This gave me some headache. [11:07am]thesing:I first thought that order of declaration doesn't matter in bbfiles. [11:07am] Soopaman left the chat room. (Connection timed out) [11:07am]woglinde:thesing nope [11:07am]woglinde:inherit it matters [11:08am] Namapoos left the chat room. (Connection reset by peer) [11:08am] NAiL is now known as AwayNAiL. [11:08am] lastik_ joined the chat room. 
[11:08am] mithro left the chat room. (Read error: 110 (Connection timed out)) [11:09am]thesing:yes. if you want to declare sth. that is used by the class you inherit you have to declare it before the inherit stmt. [11:09am] AwayNAiL is now known as NAiL. [11:09am]woglinde:thesing not only that [11:10am]woglinde:if you inhert foo1 foo2 [11:10am] marex joined the chat room. [11:10am]woglinde:and you declare the same variable in both [11:10am]woglinde:the last one is taken [11:10am]kergoth:depends on when the class's functionality is evaluated, which is what pb was pointing out, processing order ..it might not need the variables until task executuion, or it might be used in an immediately exapnded variable or anonymous function [11:10am]cbrake:what LICENSE is this: http://pastebin.ca/1232999 [11:10am]cbrake:wondering what to put in the OE recipe [11:11am] Namapoos joined the chat room. [11:11am]kergoth:smells like MIT or X11, but i'm rusty [11:11am]kergoth:check the osi page [11:11am]cbrake:kergoth: ok, thanks [11:11am]woglinde:hm hplip package [11:11am]cbrake:woglinde: yup [11:11am]cbrake:woglinde: you ever work with it? [11:12am]woglinde:cbrake http://hplipopensource.com/node/296

TODO - Other Yocto and OE

Ubuntu VM/chroot/container notes on setup:
    libc6-dev-i386 (for pseudo, so we can load it with the 32-bit external toolchain binaries)
    git
    python
    ssh (I use ssh auth for the private git repositories)

From sanity:
    make
    gcc
    chrpath
    g++
    patch
    diffstat
    texinfo
    gawk
    wget
    In 14.04:
        perl-modules libdata-dumper-simple-perl
    In 12.04:
        perl-modules

CodeBench installer needs: unzip

TODO: uninstall libc6-i386 and make sure the errors from meta-sourcery are as clean and straightforward as possible. E.g. store details on the external toolchain binary execution failures and then display them before we exit, but don't display them at parse time to avoid spamming the user.

Bugs
  • Open a yocto bug for the cross-recipe eventhandler leakage, if it still exists, see https://gist.github.com/kergoth/11460892
  • Review and submit missing python dep fixes from https://gist.github.com/kergoth/7d2e9f6c7d0fd759d7f2
  • libx11 is pulled into a console-image by alsa, pulseaudio, consolekit (by pulseaudio-module-console-kit), and dbus, but x11 in dbus is used for autolaunch stuff, which we might want
    • gtk+ config in alsa-tools
    • x11 config in dbus
    • x11 config in pulseaudio
  • rpcbind service is enabled, but inactive/dead, not being started on boot
  • In meta-sourcery, add a postprocess step to do_install_ptest_base to fixup ownership on ${D}${PTEST_BASE}, since we bypass pseudo for our gcc binary for the cs-license issue, and most ptest installation uses cp/tar to copy parts of the source tree.
  • If you have a local git mirror tarball, but no local clone, and MIRRORS and PREMIRRORS are unset, the fetch will fail, as there's no logic to check for the mirror tarball already existing, lacking a mirror fetch method to call. Might be able to adjust the return value of localpath() the way I do for update, but that could have complications
  • codeparser doesn't handle non-call name lookups to the metadata
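
To illustrate the codeparser item above, here is a minimal sketch using Python's standard `ast` module. It is not bitbake's codeparser, and it assumes that "non-call name lookups" means a function that is referenced by name (for example passed as a callback) without being called directly. The helper `called_and_referenced_names` and the sample snippet are hypothetical.

```python
import ast


def called_and_referenced_names(source):
    """Split bare names in `source` into those called directly and
    those that are only looked up (never in call position)."""
    tree = ast.parse(source)
    called = set()
    loaded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # e.g. do_thing(...): a tracker that only inspects Call nodes sees this
            called.add(node.func.id)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            # any name that is read, including callbacks passed as arguments
            loaded.add(node.id)
    return called, loaded - called


# Hypothetical metadata code: sort_helper is used, but never called directly.
snippet = """
result = sorted(items, key=sort_helper)
other = do_thing(result)
"""

called, referenced_only = called_and_referenced_names(snippet)
print("called:", sorted(called))
print("referenced only:", sorted(referenced_only))
```

A dependency tracker that only looks at call nodes would record `do_thing` here but miss `sort_helper`, even though a change to `sort_helper` changes the behavior of the code.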

Features / Improvements

  • Add a feature for image.bbclass which limits the number of previous files/images to keep around. E.g. current+3. This would be particularly helpful when building 4gb sd card images with wic, for example. We do have the ability to remove all old images via RM_OLD_IMAGE, iirc, but a count would be more flexible.
  • If the license priority bits never went upstream, see about doing so. https://gist.github.com/kergoth/1590028
  • Improve bitbake function compilation/evaluation/execution - https://gist.github.com/kergoth/743677
  • Submit portable implementation of dirs/pushd/popd from https://gist.github.com/kergoth/5044367 as utils in oe-core for use in the metadata. The implementation uses space separator, so will break in paths with spaces, but we break there anyway.
  • Add support for task-level exports to bitbake - ala https://gist.github.com/kergoth/8410245 but in bitbake itself
  • Submit prepare_sources to upstream Yocto - https://gist.github.com/kergoth/9d7d31999db83ab045db
  • Add default PACKAGECONFIG value to base.bbclass from DISTRO_FEATURES, with an intermediate variable for the recipe to use if they want to extend the default
  • Is the packageconfig-backdel prototype approach worthwhile, or would it be better to have a standalone tool which emits PACKAGECONFIG_remove_pn-* in a .conf which can be included, or modified further? Is there any other obvious low hanging fruit wrt items to add to CONSIDERED? E.g. xz is probably worthwhile across the board. tcp-wrappers should really obey a distro feature in all cases, but this isn't the case at the moment
  • Leverage bits from populate_sdk_ext for archive-release
    • Nothing from populate_sdk_ext.bbclass will be of use to us, as it seems to be all specific to how they want the ext sdk constructed, but oe.copy_buildsystem and the scripts it calls will be of use.
    • Make use of oe.copy_buildsystem.{generate_locked_sigs,prune_lockedsigs,create_locked_sstate_cache} to populate the sstate cache for archive-release. We won't be shipping the locked sigs configuration, since our users are expected to be doing more with the product, but we can use a temporary file for it just to populate the sstate cache.
    • Look into the bitbake/layer handling. I checked into oe.copy_buildsystem.BuildSystem.copy_bitbake_and_layers, but it's of limited usefulness to us since we want independent tarballs for each component, rather than a monolithic archive. Still, could be worth using if we then archive up each file/dir in the directory it populates. One potential issue is, it moves non-corebase-relative layers to the root, so layers like meta-networking will likely end up in the root of the sdk rather than retaining their relative path from ${COREBASE}/... We should confirm that behavior, though.
  • Add wildcard support to blacklist.bbclass
  • Fix issues in meta-named-configs-prototype for adit
  • Do we want to use the binary variant prototype? Is it kosher to ship the source recipe even though it can't be built without the sources which we don't ship? See https://gist.github.com/kergoth/f434af71e9b8d442e0f0. IIRC the prototype had some issues, in that we don't want the binary-only state activated on the main recipe, otherwise it'll affect the source-to-binary variant as well. We'll want two variants, one for binary, one for source-to-binary, with the main recipe marked as unbuildable, or not default.
  • Should I do anything with the pkgconf recipe bits? See amor:/scratch/cedar/mel-tiny/meta-kergoth-wip
  • Submit bitbake UI modification to keep recipe+task name prefixed in all log messages.
  • Longer term: fixup .pc files to work 100% with pkgconf-native rather than pkgconfig-native. We want .pc files to leverage Requires rather than adding deps bits to Cflags/Libs, and ideally also want .pc files using Requires.private most of the time rather than Requires. It'd be interesting to add a little code to default the latter in .pc files which only specify Requires and not Requires.private, with a whitelist for cases we know are handled sanely.
  • Look into what's needed to avoid nostamp with externalsrc builds. Presumably capturing a proper git version from the externalsrc git repository would do, but it'd need to know to invalidate caches when HEAD changes, much as my old srctree+gitver did.
  • Consider implementing external-toolchain.bbclass using the fetcher's localpath handling to locate the files, but with a custom unpack. We'd need to handle missing files differently, since we don't want errors at parse time for missing files, so we need our own variables, not SRC_URI, FILESPATH, or PREMIRRORS. I could alter the metadata to redirect WORKDIR to B or some other subdir of WORKDIR, so the files all get put in an isolated location which is then installed to ${D}. See https://gist.github.com/kergoth/1fd4bfcf46045305d259 for reference. That approach would have some issues, as the fetcher resolves symlinks, etc.
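
The image retention idea at the top of this list (keep the current image plus a fixed number of previous ones) could be sketched roughly as below. This is only an illustration of the selection logic, not image.bbclass code; the function name `images_to_remove`, the `keep_previous` parameter, and the `(path, mtime)` input format are all assumptions.

```python
def images_to_remove(images, keep_previous=3):
    """Given (path, mtime) tuples for one image's build outputs, return the
    paths to delete so that only the newest plus `keep_previous` older
    images remain."""
    newest_first = sorted(images, key=lambda item: item[1], reverse=True)
    # Index 0 is the current image; the next `keep_previous` entries are kept.
    return [path for path, _mtime in newest_first[1 + keep_previous:]]


# Hypothetical example: six builds of the same image, mtime increasing.
builds = [("image-%d.wic" % n, n) for n in range(6)]
print(images_to_remove(builds))  # the two oldest builds
```

With fewer than `keep_previous + 1` images present, the function returns an empty list, so nothing is removed.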

Misc: kergoth: that would be good thanks! I think in the main part I'm really struggling with are the intermittent "random" autobuilder failures we keep seeing. If you fancy picking any of those up you'd be welcome. Other ideas are some of the less loved parts of the system like systemd, or some of the general code "infrastructure" improvements we keep talking about

TODO - Yocto & OE

SDK

  • kergoth: I played around with rewriting the SDK to be a self extracting Python file instead of a shell script (see poky-contrib/jpew/pyz-sdk)

Recipetool

  • Handle extralines for oe.recipeutils.bbappend_recipe()
  • Avoid SRC_URI duplication in bbappend_recipe
  • Write unit tests (oe-selftest) for create_buildsys_python

Devtool

	- [ ] Review the code of the ossystems setup scripts, specifically the hook implementation
		http://doc.ossystems.com.br/managing-platforms.html#__code_setup_environment_code_hooks
		http://code.ossystems.com.br/gitweb?p=ossystems-yocto-base-scripts.git;a=blob;f=setup_environment_internal.py;h=d9e43210aded9d1ada403b71bd724b0a47af3604;hb=44bef26947245468028d79f45448a6a092ecbc54
		- [ ] Add thoughts and review notes to https://docs.google.com/document/d/1nso0BcluDHJLSFfbYqj2eV0eA6-ElKusPz7XhrMmPxM/edit?ts=5717e8d4
	- [ ] Complete the source-preparation-hook for oe-core, then prep it for submission
		https://bugzilla.yoctoproject.org/show_bug.cgi?id=6372
		- [ ] devtool source extraction currently has hardcoded hooks for the source prep for the kernel, and no real way to extend what it does when extracting the sources. Discuss this with Paul. I also think we should add the ability for the metadata to inject extra externalsrc logic, the way devtool does the do_configure_append for .config for kernel recipes
			- [ ] Open a Yocto bug for this so we don't forget about it
			- [ ] 1. Do the source preparation in two phases. Phase one runs before writing the bbappend and switching to externalsrc. Phase two runs after devtool parses the recipe with externalsrc and examines SRCTREECOVEREDTASKS: if any tasks listed there haven't been run already, run them all, and tag devtool-prepared. To do this, we'd need kernel-yocto to append to SRCTREECOVEREDTASKS itself, rather than having devtool do it, which I think may be reasonable
			- [ ] 2. Make prepare_sources a 'the sources are completely prepared at this point' task (similar to do_image_complete), make it always exist, just not always hooked into the task graph automatically, then devtool explicitly builds that task, and in the case where the hook isn't defined by the recipe, it just unpacks+patches. Then kernel.bbclass would make prepare-sources depend on its source preparation tasks
	- [ ] Remove usages of addtask (and possibly also prefuncs/postfuncs) from recipes in favor of explicitly provided hooks
		- [ ] populate_packages for the dynamic package emission? This hook should be a PACKAGEFUNC
	- [ ] To Prototype
		- [ ] Limit metadata access to undeclared vardeps
	- [ ] https://gist.github.com/kergoth/9748594
	- [ ] https://gist.github.com/kergoth/8424665
	- [ ] Consider adding static build support
- Bugs:
		- [ ] I just modified meta-environment and the ade wasn't reconstructed?
		- [ ] if i bitbake -c clean systemtap; bitbake -C configure systemtap, bitbake re-runs 149 setscenes. packagedata for gcc-runtime-external, etc
		- [ ] After a non-externalsrc build, then devtool modify -x, before devtool build: WARNING: Unable to get checksum for cups SRC_URI entry configure.sstate: [Errno 2] No such file or directory: '/scratch/dogwood/cedar-merge/build/tmp/work-shared/cups/2.1.3-r0/configure.sstate'
		- [ ] externalsrc combined with -C fetch / -C configure seems to result in an attempt to re-fetch/unpack
		- [ ] Open an issue for the qemu user mode failure on intel-corei7-64 in a vmware vm (labmanager)
		- [ ] SDK_OS seems to be missing LIBCEXTENSION and possibly also ABIEXTENSION. I'm not sure anyone cares, unless they want to support changing the libc of nativesdk. I do care about this, so will likely prepare a patch series.
		- [ ] recipetool newappend -w -e meta-crosssdk-suffix nativesdk-libgcc
			Parsing recipes..done.
			ERROR: Unable to determine layer directory containing virtual:nativesdk:/scratch/yocto-new/crosssdk-suffix/poky/meta/recipes-devtools/gcc/libgcc_5.3.bb
		- [ ] phonet-utils shouldn't inherit autotools-brokensep, it's not an autotools recipe
		- [ ] bug in devtool update-recipe. if S is a subdir of the upstream git repository, the commit will have paths relative to the root, not relative to S, so the patch it generates won't apply
  - wic:
			- [ ] No ability to set an initrd/initramfs for an efi disk image, only for iso hybrid images
	- [ ] We need a python API to remove _remove overrides
	- [ ] Best practice in recipes for PACKAGECONFIG should be to use =, not ?= nor ??=, otherwise the user setting 'PACKAGECONFIG' unknowingly in a local.conf will break the universe. Related: warn if it's set without overrides in the configuration metadata.
	- [ ] Review missing python runtime dependencies
		https://gist.github.com/kergoth/7d2e9f6c7d0fd759d7f2#file-check-python-deps-txt
	- [ ] Consider applying Angstrom's license MACHINE_ARCH separation to mel.conf
		https://github.com/Angstrom-distribution/meta-angstrom/commit/dae094d69a1f8651c822fb35d606bf59a615f244
	- [ ] Improve licensing:
		- [ ] Enhance oe.license.flattened_licenses() to kill duplicates
			- [ ] ex. GPLv2 & LGPLv2 & GPLv2 & LGPLv2.1 -> GPLv2 & LGPLv2 & LGPLv2.1
		- [ ] Priority sorted/filtered licensing
			https://gist.github.com/kergoth/1590028
			- [ ] Use this license flattening for INCOMPATIBLE_LICENSE
			- [ ] Use this license flattening in the packaging process for the package LICENSEs
			- [ ] Use this for the license manifest, so as to not confuse the user
			- [ ] Implement expansion of '+', obeying these preferences
			- [ ] Implement license-filtered sstate archive distribution for sstate mirror population without licensing issues
	- [ ] Investigate ReproducibleBuilds
		https://wiki.debian.org/ReproducibleBuilds
	- [x] bb:
		- [x] Setup scripts, leveraging sh- prefixed subcommands
- bitbake:
		- [ ] Talk to Richard about what I can do to help take on some of his load / reduce the bottleneck on him for bitbake and core class changes. Paul is good about reviewing some of that, Saul and Ross both help with the oe-core changes, but for bitbake internal changes Richard is still a bottleneck, and it's affecting us as well as others. Perhaps I could become a lieutenant for bitbake, reviewing patches, testing, and handing off consolidated bitbake requests? Food for thought.
		- [ ] bitbake -S doesn't error on invalid/unimplemented arguments, it just does nothing and exits successfully. I mistyped -C as -S and didn't notice at first.
		- [ ] Add a new runqueue for clean tasks to run before the regular tasks. This was mentioned by Richard on IRC, and it does seem like a nice, clean way to make it so we can run both clean and non-clean tasks in the same build. Ideally, we'd not just handle clean on the command-line, but also if pulled in via dependencies, but to do that we'd have to break apart the dependency graph.
		- [ ] Afaict only *tasks* run prefuncs/postfuncs, yet the dependency code includes prefuncs/postfuncs in function checksums, not only tasks
		- [ ] Another nice thing would be a -S argument for dumping the signature of a recipe+task rather than having to -S none + bitbake-dumpsig the correct path
		- [ ] I'd love a -S argument like printdiff, but which is solely limited to stamps and doesn't poke around SSTATE_DIR at all
		- [ ] Allow metadata to exert control over the shell script setup in build.py. Trap setup, set -e, ..
		- [ ] Would be nice to add color to the recipe/task log message prefix
		- [ ] Consider splitting out runqueue execution into a standalone tool, i.e. emit scripts and a build.ninja and let ninja do the task execution by running the scripts
		- [ ] Rather than only setting the colorizing log handlers on the fds we know are ttys, we could instead (or in addition..) filter out any color codes for fds that aren't. This would let callers pass along their own colored messages which would adapt to where they're going, rather than only being able to color the NOTE/WARNING/ERROR/DEBUG prefix. Alternatively, use templating and empty the values when it can't be used, i.e. {blue} or %(blue)s.
		- [ ] Now that *task* messages are prefixed, we should inject recipe/task info into every log message in an appropriate context, so we can also prefix e.g. messages coming from event handlers and anonymous python with at least the recipe name / version.
  - would-be-nice:
			- [ ] Deprecate __anonymous, initially adding a warning. RP acked this.
			- [ ] Deprecate the SkipPackage exception in favor of a variable which instructs bitbake that the recipe cannot be used, with a reason. E.g. BB_SKIPPED. This would allow one to use bitbake -b -e with a skipped recipe, for example.
				Richard points out that this could see problems with recipes which are skipped because they're unparseable in that context. This is true, however I'd like to see such recipes fixed to always be parsable, just not always usable.
			- [ ] Implement bitbake-internal support for variable typing
			- [ ] Add a feature which blocks access to variables which bitbake doesn't know about as vardeps, for both variable expansion and executed tasks.
- meta-sourcery:
		- [ ] Attempt to support musl external toolchains in meta-sourcery. Specifically it may be worth testing it with the static aboriginal musl cross-compiler toolchains.
		- [ ] Attempt to extract TUNEABI_WHITELIST and TUNEABI_<abi> information using gcc -print-multi-lib and potentially examine the ELF sections to try to extract information about the baseline default multilib configuration
		- [ ] The use of --no-sysroot-suffix is problematic when dealing with older Sourcery toolchains, like the 4.5 lite, as apparently the argument hasn't always been available. Investigate further, decide whether to revert to the symlink method, drop support for older toolchains on the master branch, or change behavior conditionally on gcc version.
		- [ ] Examine meta-linaro (http://git.linaro.org/?p=openembedded/meta-linaro.git;a=tree;f=meta-linaro-toolchain;hb=master) for ideas or useful bits
		- [ ] Create a recipe which creates toolchain wrappers, the way seebs was doing for Wind River, as a convenience
	- [ ] buildhistory-diff should probably diff the install scripts (pre/post inst/rm)
- To Consider:
		- [ ] Change the default path prefix for native/cross to an invalid location. This would ensure that any relocation issue is caught as soon as it's used, regardless of whether that's from scratch or from sstate. The way it is today, a from-scratch build can work while an sstate-restored one doesn't, depending on the path it's installed to, since the path prefix is valid based on where it was built. We'd have to alter the sysroot fixups to also replace the invalid path, though, otherwise the adjustment to the correct path wouldn't be done.
- Long Term / Would Be Nice:
		- [ ] If possible, consider a nicer mechanism for credential handling. I.e. if we could securely pass the credentials into the underlying fetcher process, we could use the python keyring module which can use the gnome keyring, vault, etc to retrieve credentials. I.e. for git we could provide our own credential helper. I think there'd be definite value in this. If it needs to prompt the user, it'd happen at parse time, not during the build, so I don't think that would be a problem, and there are multiple appropriate keyring backends one could use on headless systems. I.e. headless gnome keyring over dbus, vault, etc.
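
For the license flattening item in the list above (killing duplicates, e.g. GPLv2 & LGPLv2 & GPLv2 & LGPLv2.1 -> GPLv2 & LGPLv2 & LGPLv2.1), a minimal order-preserving sketch follows. It assumes the flattened result is a plain list of license name strings; the helper name `dedupe_licenses` is hypothetical and this is not a patch against oe.license.

```python
def dedupe_licenses(licenses):
    """Drop repeated license names while keeping first-seen order."""
    seen = set()
    unique = []
    for name in licenses:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


flattened = ["GPLv2", "LGPLv2", "GPLv2", "LGPLv2.1"]
print(" & ".join(dedupe_licenses(flattened)))
# prints: GPLv2 & LGPLv2 & LGPLv2.1
```

Keeping first-seen order means the output stays stable and readable in a license manifest, which a plain `set()` would not guarantee.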

Bugs
  • The blktrace ldflags patch needs iowatcher added after the version bump, similar to what eb55720 did for btrecord
  • Open a yocto bug for the cross-recipe eventhandler leakage, if it still exists, see https://gist.github.com/kergoth/11460892
  • Review and submit missing python dep fixes from https://gist.github.com/kergoth/7d2e9f6c7d0fd759d7f2
  • LAYERDIR is used as is in BBFILE_PATTERN, not regex escaped. https://gist.github.com/kergoth/be105473eb69b55efd33
  • libx11 is pulled into a console-image by alsa, pulseaudio, consolekit (by pulseaudio-module-console-kit), and dbus, but x11 in dbus is used for autolaunch stuff, which we might want
    • gtk+ config in alsa-tools
    • x11 config in dbus
    • x11 config in pulseaudio
  • rpcbind service is enabled, but inactive/dead, not being started on boot
  • [.] bb.fetch issues
    • If you have a local git mirror tarball, but no local clone, and MIRRORS and PREMIRRORS are unset, the fetch will fail, as there's no logic to check for the mirror tarball already existing, lacking a mirror fetch method to call. Might be able to adjust the return value of localpath() the way I do for update, but that could have complications
  • codeparser doesn't handle non-call name lookups to the metadata
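
The LAYERDIR escaping bug in the list above can be reproduced with a short sketch, assuming the pattern ends up compiled as a Python regular expression. The layer path below is hypothetical, chosen because it contains a regex metacharacter.

```python
import re

# Hypothetical layer location containing '+', which is a regex quantifier.
layerdir = "/work/gtk+/meta-foo"
recipe = layerdir + "/recipes-core/foo/foo_1.0.bb"

# Analogous to a pattern of the form "^${LAYERDIR}/" with no escaping:
# "gtk+" now means "gt" followed by one or more "k", not a literal "+".
unescaped = re.compile("^" + layerdir + "/")

# With the path escaped, every character is matched literally.
escaped = re.compile("^" + re.escape(layerdir) + "/")

print("unescaped matches own recipe:", bool(unescaped.match(recipe)))
print("escaped matches own recipe:", bool(escaped.match(recipe)))
```

In this case the unescaped pattern fails to match the layer's own recipes. Other metacharacters such as `.` fail the other way, matching paths that belong to a different directory.
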
Features / Improvements
  • If the license priority bits never went upstream, see about doing so. https://gist.github.com/kergoth/1590028
  • Improve bitbake function compilation - https://gist.github.com/kergoth/743677
  • Submit portable implementation of dirs/pushd/popd from https://gist.github.com/kergoth/5044367 as utils in oe-core for use in the metadata. The implementation uses space separator, so will break in paths with spaces, but we break there anyway.
  • Submit prepare_sources to upstream Yocto - https://gist.github.com/kergoth/9d7d31999db83ab045db
  • Add default PACKAGECONFIG value to base.bbclass from DISTRO_FEATURES, with an intermediate variable for the recipe to use if they want to extend the default
  • Is the packageconfig-backdel (https://gist.github.com/kergoth/2c556d9e34cbcd03cb82) prototype approach worthwhile, or would it be better to have a standalone tool which emits PACKAGECONFIG_remove_pn-* in a .conf which can be included, or modified further? Is there any other obvious low hanging fruit wrt items to add to CONSIDERED? E.g. xz is probably worthwhile across the board. tcp-wrappers should really obey a distro feature in all cases, but this isn't the case at the moment
  • oe-selftest should use imp.get_suffixes() rather than hardcoding .py
  • Determine if we can implement the sstate distro fallback handling for natives via direct alteration of SSTATE_MIRRORS, rather than through SSTATE_MIRROR_SITES. The latter is a nice feature on its own, but it'd be nice if the two weren't tied together.
  • Look into how do_populate_sdk_ext works, specifically for downloads and sstate gathering, and leverage that in archive-release, or even rework archive-release to be a task at the image level the way that one is, sharing most of the same code
  • Leverage bits from populate_sdk_ext for archive-release
    • Nothing from populate_sdk_ext.bbclass will be of use to us, as it seems to be all specific to how they want the ext sdk constructed, but oe.copy_buildsystem and the scripts it calls will be of use.
    • Make use of oe.copy_buildsystem.{generate_locked_sigs,prune_lockedsigs,create_locked_sstate_cache} to populate the sstate cache for archive-release. We won't be shipping the locked sigs configuration, since our users are expected to be doing more with the product, but we can use a temporary file for it just to populate the sstate cache.
    • Look into the bitbake/layer handling. I checked into oe.copy_buildsystem.BuildSystem.copy_bitbake_and_layers, but it's of limited usefulness to us since we want independent tarballs for each component, rather than a monolithic archive. Still, could be worth using if we then archive up each file/dir in the directory it populates. One potential issue is, it moves non-corebase-relative layers to the root, so layers like meta-networking will likely end up in the root of the sdk rather than retaining their relative path from ${COREBASE}/..
  • Add wildcard support to blacklist.bbclass
  • Determine if sstate lockdown is capable enough for adit. From an initial attempt, it looks like it simply prevents changes, erroring out the build if the metadata changes. What we're looking for is a way to force something to not rebuild even if the metadata changes, or to let it rebuild, but not have its dependencies rebuild. We might need to customize the class.
  • Fix issues in meta-named-configs-prototype for adit
  • Do we want to use the binary variant prototype? Is it kosher to ship the source recipe even though it can't be built without the sources which we don't ship? See https://gist.github.com/kergoth/f434af71e9b8d442e0f0. IIRC the prototype had some issues, in that we don't want the binary-only state activated on the main recipe, otherwise it'll affect the source-to-binary variant as well. We'll want two variants, one for binary, one for source-to-binary, with the main recipe marked as unbuildable, or not default.
  • Should I do anything with the pkgconf recipe bits?
  • Longer term: fixup .pc files to work 100% with pkgconf-native rather than pkgconfig-native. We want .pc files to leverage Requires rather than adding deps bits to Cflags/Libs, and ideally also want .pc files using Requires.private most of the time rather than Requires. It'd be interesting to add a little code to default the latter in .pc files which only specify Requires and not Requires.private, with a whitelist for cases we know are handled sanely. Note, latest HEAD of pkgconf supports PKG_CONFIG_SYSTEM_{INCLUDE,LIBRARY}_PATH.
  • Look into what's needed to avoid nostamp with externalsrc builds. Presumably capturing a proper git version from the externalsrc git repository would do, but it'd need to know to invalidate caches when HEAD changes, much as my old srctree+gitver did.
  • Consider implementing external-toolchain.bbclass using the fetcher's localpath handling to locate the files, but with a custom unpack. We'd need to handle missing files differently, since we don't want errors at parse time for missing files, so we need our own variables, not SRC_URI, FILESPATH, or PREMIRRORS. I could alter the metadata to redirect WORKDIR to B or some other subdir of WORKDIR, so the files all get put in an isolated location which is then installed to ${D}. See https://gist.github.com/kergoth/1fd4bfcf46045305d259 for reference.
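
On the oe-selftest suffix item in the list above: the `imp` module has been deprecated since Python 3.4 and was removed in Python 3.12, so on current Python the same idea would go through `importlib.machinery`. The sketch below is an assumption about how such a check might look, not oe-selftest code; `is_module_file` and `module_name` are hypothetical helpers.

```python
import importlib.machinery


def is_module_file(filename):
    """True if `filename` ends with any suffix the import system recognizes
    (source, bytecode, or extension module)."""
    return any(filename.endswith(suffix)
               for suffix in importlib.machinery.all_suffixes())


def module_name(filename):
    """Strip the longest recognized module suffix, or return None."""
    for suffix in sorted(importlib.machinery.all_suffixes(), key=len, reverse=True):
        if filename.endswith(suffix):
            return filename[:-len(suffix)]
    return None


for candidate in ("buildoptions.py", "README.txt"):
    print(candidate, is_module_file(candidate), module_name(candidate))
```

If only source files should count as test modules, `importlib.machinery.SOURCE_SUFFIXES` is the narrower list to iterate over.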

OpenEmbedded Thoughts - No fetch, recipe level scheduling, script emission

Let's picture a directory which contains git clones of many open source projects, all the projects you need to build a basic filesystem. Further, let's say that we have bitbake recipes available for all of these as well. Let's also say that we have one or more virtual machines or boards for our target architecture available, with minimal root filesystems. We presume that we also have a network filesystem to ensure file availability amongst the machines.

Now, let's imagine a tool which parses these bitbake recipes and produces a recipe level (not task level) job queue. It finds the recipes which can currently be built (those with no dependencies), and for each, emits a script which will build it to completion, including packaging. Then, this tool either directly dispatches these jobs to the worker machines, or uses an existing job scheduling tool to do it. When a job completes, it captures the binary packages from it, adds them to a package feed, and notes that this dependency is satisfied. Then it continues, identifying newly buildable recipes and dispatching to workers to build them one by one. Each job constructs its own sysroot with the package manager, from the aforementioned package feed, so there's no risk of contamination.
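
The recipe-level queue described above can be sketched with the standard library's `graphlib` (Python 3.9+). This is a sequential toy under stated assumptions: the dependency map, the `build` callback, and the list standing in for a package feed are all hypothetical, and a real tool would dispatch ready recipes to workers concurrently rather than one at a time.

```python
from graphlib import TopologicalSorter


def run_recipe_queue(recipe_deps, build):
    """recipe_deps maps each recipe to the set of recipes it depends on.
    build(recipe) stands in for the emitted script that builds and
    packages one recipe. Returns recipes in the order they were built."""
    sorter = TopologicalSorter(recipe_deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    feed = []
    while sorter.is_active():
        # get_ready() yields every recipe whose dependencies are satisfied
        for recipe in sorter.get_ready():
            build(recipe)
            feed.append(recipe)   # "add its packages to the feed"
            sorter.done(recipe)   # unblocks recipes that depend on it
    return feed


# Hypothetical three-recipe chain.
deps = {
    "busybox": {"glibc"},
    "glibc": {"linux-libc-headers"},
    "linux-libc-headers": set(),
}
order = run_recipe_queue(deps, build=lambda recipe: None)
print(order)
```

Because scheduling happens per recipe, the only state shared between jobs is the package feed, which matches the "no risk of contamination" goal in the paragraph above.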

A build of anything with this tool simply results in populated package feeds, which can then be used by either our own image/rootfs construction tools, or it could dispatch a job to a worker which runs debootstrap/rpmbootstrap/etc to construct it instead.

By avoiding fetch/unpack/patch, we eliminate the main requirement for task level scheduling.

We could leverage existing contents of the worker machines to produce the base packages. For example, we could use aboriginal linux to produce the virtual machine, then do something similar to our external toolchain recipes to package up the existing contents of that filesystem, to avoid the need to rebuild glibc & co.

It's theoretically possible that we could leverage project-builder.org to do the project builds, but the problem is, they operate against the host rootfs, not a sysroot.

Docs & Training

  • What it is and how it works
    • Basic history and background
      • Original requirements
        • Support cross-compilation
        • Self contained
        • Build on any arbitrary build distribution
        • Target any arbitrary target distribution and hardware, to facilitate metadata sharing
        • Support emission of a disk image, kernel, etc to run on actual hardware
        • Support emission of binary packages for a package feed
        • Ideally, allow us to physically separate metadata used by a distro vs machine vs ..
        • File format
          • Directly parsed, not a shell script
          • Ideally not limited to just shell for extensibility
          • Key=value pairs, plus information about that metadata
          • Support some form of inheritance as a means of abstracting out handling of common elements
    • Core concepts
      • Metadata driven, key=value store + flags, "everything" is metadata, including tasks
      • Executes tasks in a specified order to produce some output artifacts
      • Context, lack of real namespaces, global vs recipe metadata and events, global flows into recipes
      • Layering within the metadata, config file load order and OVERRIDES
      • Maximum flexibility and orthogonal elements to pull bits together into a whole
      • Declarative vs Imperative aspects of our file format, the make heritage, and how this can cause problems
    • How bitbake works: see Beth's chapter of The Architecture of Open Source Applications
      • How dependencies of various forms are used to produce a final task graph for execution
    • The pieces and how they fit together
    • OE and Bitbake from a historical perspective
    • How OE/Poky handle cross-compilation issues
  • Best Practices: pitfalls to avoid, smoothing out project interactions
    • Common problems
      • Heavily duplicated metadata
      • PR/PRINC not utilized properly (less of an issue in the future)
      • Crossing the distro/machine/image boundaries
      • Not realizing how variables are handed off between OE and the buildsystem in question
      • Reinventing the wheel, not leveraging existing upstream functionality
    • Overall Principles
      • Minimize the metadata that has to be maintained
      • Keep open communication with upstream: push early and often, and start communication with upstream about major endeavors as soon as possible. Use wide communication channels (don't email individuals alone)
      • Retain high flexibility, and the orthogonality of the distro, machine, and image mechanisms
        • Properly utilize DISTRO_FEATURES and MACHINE_FEATURES
          • Explain how packagegroup-base utilizes the two variables to determine what goes into the baseline image
        • IMAGE_INSTALL_append in a distro .conf, extensive use of *_RRECOMMENDS, *_RDEPENDS
        • How bitbake determines what to build, and what to include in an image
        • Package groups (both types) and how and where they're used
      • New variables (particularly those intended for the user) should be both documented and typed (when appropriate)
    • Layer Maintenance
      • Keep layers as low impact as possible
        • Layers should not affect build behavior (other than recipe availability) by their inclusion. The changes in behavior they provide should be opt-in, not opt-out.
          • Key points:
            • Use the machine override for bsp layers
            • Use the distro override for distro layers
            • Do as little as possible in layer.conf
    • Recipe Creation & Maintenance
      • When should variables be exported / how bitbake expands its variables
      • Use of EXTRA_OEMAKE, EXTRA_OECONF, EXTRA_AUTORECONF, acpaths, B
      • RDEPENDS, RRECOMMENDS
      • LICENSE variable format (boolean logic)
    • General Perils of Flexibility
      • bbclasses vs .inc files for optional functionality: prefer bbclasses to inclusion of a .inc from a .conf. Consider disabling bitbake's ability to include a .inc from a .conf?
    • Bbappend maintenance
    • When to use and when not to use certain available features
      • LICENSE="CLOSED"
      • Use of a .inc file for a recipe
      • anonymous python functions, event handlers, inline python
      • bbclasses versus .inc files
      • addtask
    • Leverage all the abilities and tools we have available, including the lesser known
      • Dependency explorer UI
      • devshell / menuconfig (though this is becoming better known)
      • bitbake -f and bitbake -C
      • bitbake -S, bitbake-diffsigs, bitbake-dumpsig
      • Variable typing
      • PACKAGECONFIG
      • Various bitbake variables
        • BBINCLUDELOGS, BBINCLUDELOGS_LINES
        • BBVERSIONS
        • BB_SCHEDULER
      • Optional classes
        • Old
          • recipe_sanity
          • missingdeps
        • Inherited by default in MEL
          • typecheck (see Variable typing)
          • sstate-reuse
          • isolated-sstate-dir
  • Debugging: what tools are available to us, and when to use each, for what purpose
    • Use of bitbake -g
    • Use of bitbake -g -u depexp
    • Use of bb & bitbake-env
    • Use of devshell
    • Use of bitbake -e
    • Use of the python debugger
    • Important areas of TMPDIR to be aware of (and inspect)
  • Tutorial / how-to: step-by-step, how to get the job done
    • Create recipes
    • Create layers
    • Use of various OE/Yocto features (e.g. optional classes)
    • Create a distro
      • Important variables
    • Create a machine
      • Important variables

Specific proposed tasks

  • Future Directions

  • We should be more consistent between bitbake and oe/scriptutils where they do common tasks.

    • Log formatting and setup
    • Argument parsing
  • We should revisit how we add python search paths and import modules for use in our eval'd python code.

    • Add a directive for layer.conf to add specific dirs to the python search path.
    • Ideally avoid globally altering sys.path or sys.modules in bitbake's server, adding better isolation for python event handlers, anonymous python, and inline python. Tasks at least are isolated from a process perspective, but not the others. Can we alter how we do our importing to not cache anything and only add them to our context
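One possible direction for the isolation question above is to load a file as a module object directly, without touching sys.path or registering it in sys.modules. This is a minimal sketch using the standard importlib machinery; `load_isolated` and the `layer_helper` module are made-up names, and how such a module would be wired into bitbake's eval context is left open.

```python
import importlib.util
import os
import sys
import tempfile

def load_isolated(name, path):
    """Load a python file as a module without registering it in
    sys.modules and without modifying sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Demonstration with a throwaway module standing in for layer-provided code.
with tempfile.TemporaryDirectory() as tmpdir:
    helper_path = os.path.join(tmpdir, "layer_helper.py")
    with open(helper_path, "w") as f:
        f.write("def greet():\n    return 'hello from a layer'\n")
    path_before = list(sys.path)
    helper = load_isolated("layer_helper", helper_path)
    result = helper.greet()
    leaked = "layer_helper" in sys.modules
    path_unchanged = sys.path == path_before
```

Note that any imports performed inside the loaded module still go through the normal import system, so this only isolates the top-level module itself.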

Metadata

  • Handle plugins in a consistent way. devtool and recipetool both use scriptutils.load_plugins, but wic uses its own thing, wic.pluginbase, wic.plugin.
  • Submit removal of obsolete recipe_sanity.bbclass
  • Make use of task-level exports to reduce global exports in the metadata
  • Reduce usage of _append/_prepend/_remove in favor of immediate alternatives,

Old pending branches/gists

Bugs

  • wic can load plugins from layers, but it doesn't seem possible for layers to override plugins which exist in the default location. This should be fixed
  • oe-selftest should use imp.get_suffixes() (or importlib.machinery.all_suffixes(), since the imp module is deprecated) rather than hardcoding .py when loading plugins. Also double-check devtool/recipetool in this regard.

Thoughts on the ESDK

See also https://wiki.yoctoproject.org/wiki/Future_Directions#Usability

populate_sdk_ext is rather messy. Much of what gets included is hardcoded and controlled via booleans, rather than obeying variables which list the targets we want. It also assumes an eSDK has to involve an image, and therefore that the target user is an application developer. It handles that use case well, but that's not the only one where an eSDK could be useful.

Also, SDKDEPLOY can't be changed, as it's used to pull buildtools-tarball, not just to control output.

It kind of feels like archiver: a ton of little variables to control behavior, which isn't usually how we like to do things. I'd like to see more flexibility in what's included, with variables more like TOOLCHAIN_*_TASK rather than a bunch of flags to flip, and potentially a new SDKEXT_FEATURES or something.

It might be nice to still be able to run bitbake from an eSDK as well, not only through devtool, but that might not be a valid use case. We should articulate the use cases we're looking to satisfy with it to begin with.

RP: kergoth: right. I'd also like to be able make an existing build into an eSDK and vice-versa

RP: kergoth: that is kind of the concept that was behind the bblock and bbunlock commands there is a bugzilla entry for

Archived / Obsolete

Bugs

  • libx11 is pulled into a console-image by alsa, pulseaudio, consolekit (by pulseaudio-module-console-kit), and dbus, but x11 in dbus is used for autolaunch stuff, which we might want
    • gtk+ config in alsa-tools
    • x11 config in dbus
    • x11 config in pulseaudio
  • rpcbind service is enabled, but inactive/dead, not being started on boot
  • In meta-sourcery, add a postprocess step to do_install_ptest_base to fixup ownership on ${D}${PTEST_BASE}, since we bypass pseudo for our gcc binary for the cs-license issue, and most ptest installation uses cp/tar to copy parts of the source tree.
  • dbus: needs libuuid.la, missing dep?
  • uclibc recipes use := to filter out CFLAGS, but it's done very early on in the parse process. If PV is in CFLAGS, this can result in an early expansion of a wrong PV, or an expansion of PV before SRCREV is defined, resulting in a ParameterError.
  • mdadm: needs groff-native

Already done upstream

  • Drop libtool archives. http://blog.flameeyes.eu/2011/05/03/surviving-without-libtool-archives. This is done by default in defaultsetup.conf via inclusion of remove-libtool.bbclass.
  • Add support for task-level exports to bitbake - ala https://gist.github.com/kergoth/8410245 but in bitbake itself
  • Event handlers require this common wrapper crap. Abstract that away by specifying the events in a variable flag. NOTE: If there was an events flag, "addhandler" could go away in favor of only calling python functions with the flag. Update: the eventhandler flag exists for this now.
  • Sanitized host bin directory w/ forced PATH. This would make our builds slightly more deterministic. Update: HOSTTOOLS does this now.
  • Create a python module to act as a wrapper around subprocess/popen/system. Update: see bb.process.run.
  • Change the anonymous function concatenation to prevent short circuiting. Update: current bitbake runs each anonymous python function separately, not concatenated.
  • Add a commandline argument which runs the specified task against all dependent recipes. Would make do_buildall, do_fetchall, etc obsolete. Update: see --runall=.

Superseded

  • Look into what's needed to avoid nostamp with externalsrc builds. Presumably capturing a proper git version from the externalsrc git repository would do, but it'd need to know to invalidate caches when HEAD changes, much as my old srctree+gitver did.
  • Switch BBHandler to the same API as my updated ConfHandler
    • Slightly less trivial, as it has a mess with global state and get_statements and the like. Refactor to avoid the many global state variables and all first.

Now-unnecessary

  • Turn CookerParser into an iterator, rather than the custom parse_next() bits. Update: parse_next wraps a generator, so I don't think this is needed.

Decided against

  • Add a feature for image.bbclass which limits the number of previous files/images to keep around. E.g. current+3. This would be particularly helpful when building 4gb sd card images with wic, for example. We do have the ability to remove all old images via RM_OLD_IMAGE, iirc, but a count would be more flexible.

  • Consider implementing external-toolchain.bbclass using the fetcher's localpath handling to locate the files, but with a custom unpack. We'd need to handle missing files differently, since we don't want errors at parse time for missing files, so we need our own variables, not SRC_URI, FILESPATH, or PREMIRRORS. I could alter the metadata to redirect WORKDIR to B or some other subdir of WORKDIR, so the files all get put in an isolated location which is then installed to ${D}. See https://gist.github.com/kergoth/1fd4bfcf46045305d259 for reference. That approach would have some issues, as the fetcher resolves symlinks, etc.

  • devshell

    • Look into emission of all task shell functions into the devshell. This is non-trivial, but would add value in that the user would be able to run shell tasks directly, as well as convenience functions like oe_runmake, rather than having to run the emitted run scripts (which may not have been emitted yet). Debatable. Give further thought.
  • Adjust the oelite ply based parser to work with the current file format, again, and integrate it into current bitbake
    • The lexer should work mostly fine once we revert some of the file format change bits, but the parser is rather different, since its operations use the oelite metadata. Even so, it should just be a matter of tweaking the contents of the various p_ methods. Update: I did investigate and prototype this, but it adds a lot of complexity with little gain.

OpenEmbedded Musings

  • Current thought: a recipe only really needs to know how to:
    • Clean
    • Source Artifacts -> Prepped Artifacts
    • Prepped Artifacts -> Deploy into structure that the project expects to run in
    • How you fetch something isn't specific to a given recipe, nor is how you package it, etc. How to break up the Deployed Artifacts is more a distribution policy choice than a recipe choice, and how you do the breakup / packaging is common. We just need to take those deployed artifacts and do things with them, and that doesn't have to be tasks for this recipe, but instead move up to a slightly higher view.
    • Recipe
      • Descriptive metadata as provided by/about upstream
      • Tasks
        • Clean
        • Source Artifacts -> Prepped Artifacts
        • Prepped Artifacts -> Deployed Artifacts
  • Split up execution of the above two operations on a recipe within the package's source tree from the bits that handle fetching / building multiple recipes / etc, possibly packaging and stuff too. First part = build using our classes, Second part = higher up enterprise level build tool type thing, maybe a continuous integration / build server type tool.
  • Our inline python snippets are essentially an internal dsl, but nowhere is it documented. For example, os is pre-imported, and 'd' is the current datastore, but that isn't documented anywhere. Adding some useful helper functions for the dsl would be appropriate.
  • What about revamping the whole concept of stamps?
    • Inline python script attached to the task which checks the stamp and returns a bool indicating whether the task needs to be run, perhaps. The default could be "os.path.exists(stamp)" or whatever.
    • We'd want to ensure that tidbit of python can check the dependent task stamps and stuff too.
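As a rough illustration of the stamp idea above, here is what such a per-task check might look like. The existence test is the default mentioned above; comparing modification times against dependency stamps is my own assumption about how "checking the dependent task stamps" could work, and `default_needs_run` is a hypothetical name.

```python
import os
import tempfile

def default_needs_run(stamp, dep_stamps):
    """Hypothetical default check: run the task if its stamp is missing, or
    if any dependency's stamp is missing or newer than our own."""
    if not os.path.exists(stamp):
        return True
    own_mtime = os.path.getmtime(stamp)
    for dep in dep_stamps:
        if not os.path.exists(dep) or os.path.getmtime(dep) > own_mtime:
            return True
    return False

with tempfile.TemporaryDirectory() as tmpdir:
    compile_stamp = os.path.join(tmpdir, "do_compile")
    install_stamp = os.path.join(tmpdir, "do_install")
    # No stamps exist yet, so the task must run.
    before = default_needs_run(install_stamp, [compile_stamp])
    # Create both stamps with fixed, ordered timestamps.
    for path, mtime in ((compile_stamp, 100), (install_stamp, 200)):
        open(path, "w").close()
        os.utime(path, (mtime, mtime))
    up_to_date = default_needs_run(install_stamp, [compile_stamp])
    # A dependency re-ran after us, so we need to run again.
    os.utime(compile_stamp, (300, 300))
    after_dep_change = default_needs_run(install_stamp, [compile_stamp])
```

A recipe or class could then swap in its own check for a given task, which is the flexibility being asked for here.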

OpenEmbedded Plan

[ ] Switch to git for our sources + patches. Update: devtool does this, but it isn't kept that way. The git_emit script takes a recipe, goes over its SRC_URI, and creates a git repository containing the source tarball contents and patches (as topic branches) referenced in the SRC_URI. Then we'll re-emit the recipe pointing at the git repository.
[ ] Figure out how to arrange the branches for the patch variants
[ ] Alter git_emit to track SRC_URI through the recipe's git history
[ ] Find/write a bitbake recipe parser which can write it back out with changes
[ ] Modify the git_emit script to modify SRC_URI
[ ] Test a build

Performance improvement

  • Consider use of cython for components -- note, need to reduce the tight binding between bitbake modules
  • Consider reworking the current metadata interface to simply be a different API/representation of the parser AST
  • Document usage of meliae and heapy for memory profiling
  • Look closer at profiling -- Richard has mentioned he cannot seem to nail down the real underlying systematic performance problem
  • Try out-of-process logging to reduce the log write overhead
    • Note, we already sort of do this via the messages being sent to the UI, but the socket is shared for both logging and events. Determine if this is problematic, and attempt to determine how much time is spent waiting for log methods to return. Potentially consider adjusting the buffer size for the UI message queue
    • TODO: make tasks a no-op, then do a build with and without log records being pushed to the UI
      • Did this, the difference was pretty minimal, as a typical build doesn't send that many log messages to the UI, it seems. I wonder if the log file bits are an impact? Test those next.
  • Suspect worst memory usage offender during the parse is the depends_cache, as our cache is fully constructed in RAM before dumping due to parse time / performance
    • For oe-core, the RAM usage during the parse hits around 240 megs, then drops down to around 70 after the up front parse for the rest of the build
    • Looks like individual task processes have a typical unique set size of 34 megs
  • Task execution is still rather slow even when exiting the forked task immediately with os._exit() rather than exec_task, as Richard pointed out
    • On my VM, it took 2.5 minutes from scratch to exec of pseudo-native/do_populate_sysroot with no-op tasks, and parsing just oe-core
    • Each task appears to take .6 to 1 seconds even if it parses but does not execute an actual task
      • For one task's parsing on this machine: 406377 function calls (401802 primitive calls) in 0.552 seconds

Generate a dep graph that includes info about what precisely pulled it in

Workflowy

Old/Tasks/Bitbake

  • General
  • Cache
  • Cooker
    • Turn CookerParser into an iterator, rather than the custom parse_next() bits
  • Data
  • User Interfaces
    • Move the default log formatter into a common place
  • Bugs
  • Design Problems
    • Tight Coupling "Modules aren't sufficiently orthogonal"
    • bb.parse tries to be pluggable for different file formats, but it's a false flexibility "The file format is tightly bound to the metadata, both bb.data and DataSmart. If we used a different file format, bb.data wouldn't necessarily make sense in its current form."
    • Events are global, rather than being bound to the recipe instance or cooker or runqueue
    • File format
      • Not fully declarative
        • inherit
        • include/require
  • Build

Random OpenEmbedded thoughts https://gist.github.com/kergoth/9983394

Dislikes

Likes

  • Layered metadata facilitates independent maintainers, as well as making it easy to break up changes in a logical way, and avoid having to modify upstream when not necessary.
  • Easy to collaborate, due to our built in support for our orthogonal axes: distro, machine, image, and conceptual layers to handle variable specificity (global, distro, arch, machine, ..)
  • "Class" mechanism lets us abstract out common functionality from recipes to pare them down to the bare essential description of the upstream project (ideally)

WouldLike

  • Improved method for making local changes to source trees, without the overhead imposed by the fetch/unpack/patch mechanism. Update: devtool does this.
  • Some form of isolated sysroots to reduce potential cross-recipe contamination. Update: per-recipe-sysroots does this.
  • In addition to handling input checksums, we should also respond to changes in the output of an operation, so a change to the input which doesn't affect the emitted binary packages won't result in rebuilding the dependencies of said package. Update: hashequiv does this.

Reference and Context

Old Code for Reference

History

The OpenEmbedded project has come far from its humble beginnings, and I'm proud to have been one of those who founded it. Today I'm going to take a step back to those beginnings and discuss some of the circumstances around its creation, the requirements that fed into the original design, and a bit of the reasoning behind the designs, so that there is a better understanding of why OE is the way it is today.

Many years ago, I started and maintained an open source project called "OpenZaurus". This project was a from scratch Linux distribution for the Sharp Zaurus line of PDAs, built from source. Originally, it was a heavily customized version of the well known "buildroot" project. After using and maintaining it for some time, it became clear that it suffered from some serious scalability problems, and much time was spent working around the limitations of GNU Make.

There were then discussions to determine the goals and requirements for a replacement which would better meet our needs. We spent time reviewing each of the tools we knew about that might have been suitable for our needs, including Gentoo's "portage", Debian's "buildd", emdebian, and a number of others.

Needs for a buildroot alternative, from memory:

- Does not require a specific distribution for the build machine, as that would limit the developers that would use it and contribute to the project
- Is fairly self contained - that is, the dependencies upon the build machine are limited, and it doesn't require a great deal of setup of your machine to get it up and running.  buildroot is a great example of the sort of thing we wanted in this particular respect
- Written in a language more flexible than GNU Make
- Directly parseable metadata and build rules.  We felt this was important in part because we expected the format to change in the long term, and we might well want to translate to a different file format. Further, having a format which is easy to directly parse means we can do interesting analysis and operations against that metadata in a way which is less limited, and allows us to potentially use something other than shell for this.

    - This already disqualified debian source archives, due to their reliance upon debian/rules as a makefile

- Support for any arbitrary target operating system, distribution, architecture, machine

    - At the time, this led to a belief that cross-compilation would be an essential feature. Even now, cross-compilation performs best, though emulation and board performance have come along quite far.

- Must have the ability to define portions of the metadata conditionally, based upon the configuration (what distribution, machine, etc).  Must also be able to use configuration files and patches conditionally.  Conditionals we know we must support: distribution, architecture, machine, "local" (local being the changes the user makes, which should always override what OE provides to them).
    - This is required to facilitate collaborative development between folks with widely varying needs
- Must support some level of abstraction for common operations, to avoid duplication of code (even shell) between recipes.  Ideally, we'll support 'classes' for things like the buildsystem of the recipe, to simplify matters for the maintainers of the metadata
- Should be able to handle fetching sources from a remote location, so we don't have to distribute all the sources with the metadata
- Should be able to unpack those sources, and apply patches to them, so the original pristine upstream sources are available, and are clearly what we base upon, rather than using single source archives of our changes upstream

When we reviewed the various candidates, we came to the conclusion that Gentoo's Portage was the closest to meeting our needs, but was insufficient as is for various reasons (no inherent crosscompilation handling, used shell as a file format, required that the build distribution be gentoo, though this was common amongst the tools we looked at, and so on). We then initiated work on our own tool, based upon the portage python code.

Decisions and reasoning regarding our implementation:

- Everything is metadata, as this gives us a great deal of flexibility.  "Tasks" are just key/value pairs like any other variable, but with an associated "flag" controlling their behavior.
- Tasks can be either python or shell (ideally, we'd be able to associate an 'interpreter' with the task as a flag, to make this generic)
- Variables can be exported into the environment of the tasks, on a per variable basis
- Collaborative repository — multiple distributions, machines, etc can all work against the same recipes, to ensure that they all benefit from the work done
- In order to support the crosscompilation requirement, we must have an area which is common to all recipes, where they can place files for other recipes to get at them (headers, libraries, etc)
- Having discussed it a fair bit, it becomes clear that our current conditionals have a certain specificity, which varies from one to the next.  This allows us to implement our conditionals as conceptual layers in the metadata.  This layering allows more specific information to always override less specific information.  We implement this layering by specificity in three ways:

    1. loading a configuration file for each, in least specific to most specific order (where possible)
    2. OVERRIDES, which is the per variable conditional handling, applied from least specific to most specific
    3. for relative file: urls, we automatically prefer the most specific version of the file, as reflected in the directory structure traversed

- Recipes must be able to inherit classes, and refer to the class versions of the tasks when they override what those tasks do.  As an example, a recipe should be able to override the configure task, but have its version run the autotools configure task first, then its additional steps after that.  This is not unlike classes in an object oriented language, where methods are able to call the methods of their superclass.
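To illustrate the third form of layering listed above (preferring the most specific copy of a file), here is a small sketch. It is not bitbake's actual lookup code: `find_most_specific`, the directory names, and the `defconfig` file are all hypothetical, and the search order is simply passed in from most to least specific.

```python
import os
import tempfile

def find_most_specific(filename, base_dir, specificity):
    """Return the first existing copy of filename, searching subdirectories
    named after each condition (most specific first), then base_dir itself."""
    for condition in specificity:
        candidate = os.path.join(base_dir, condition, filename)
        if os.path.exists(candidate):
            return candidate
    candidate = os.path.join(base_dir, filename)
    return candidate if os.path.exists(candidate) else None

with tempfile.TemporaryDirectory() as files_dir:
    # A generic copy, plus architecture- and machine-specific copies.
    for subdir in ("", "arm", "myboard"):
        os.makedirs(os.path.join(files_dir, subdir), exist_ok=True)
        with open(os.path.join(files_dir, subdir, "defconfig"), "w") as f:
            f.write(subdir or "generic")
    # Most specific first: local, machine, architecture, distribution.
    search = ["local", "myboard", "arm", "mydistro"]
    chosen = os.path.relpath(
        find_most_specific("defconfig", files_dir, search), files_dir)
    fallback = os.path.relpath(
        find_most_specific("defconfig", files_dir, ["otherboard"]), files_dir)
```

Here the machine-specific copy wins over the architecture-specific and generic ones, and a build for an unrelated board falls back to the generic file.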

Some of the substantial changes since its creation:

- Events
- Split BitBake apart from OpenEmbedded
- "New style", automatic, staging
- Packaged staging, which was then superseded by shared state
- BBCLASSEXTEND
- SRCREV / AUTOREV
- sanity / insane and other useful checks

It's my hope that this historical overview will be helpful in grasping certain aspects of how the OpenEmbedded, Poky, and related projects work, and why they work the way they do.

User Stories

  • Build a root filesystem or image to run on your device
  • Cross compile your application or kernel
  • Actors
    • Application Developer
    • Kernel Developer
    • Configuration Manager
    • Distribution Maintainer
    • Integrator

Lesser Known Yocto/OE Capabilities

Date: 2012-08-21

Let’s go over a few Yocto/OE capabilities which seem to be less commonly known.

Variable Typing

The first item I will cover is the "Variable Typing" support. This feature allows one to declare the 'type' of a given metadata variable. This both enables a type check to ensure that it's syntactically valid, and returns an object of the appropriate type when using the python interface. Admittedly, this is promotion of something I created, but I do believe there's value here, and future posts will cover a variety of features created by a variety of individuals.

The implementation in oe-core consists of two components:

  • Type creation python modules, whose job it is to construct objects of the defined type for a given variable in the metadata
  • typecheck.bbclass, which iterates over all configuration variables with a type defined and uses oe.types to check the validity of the values

This gives us a few benefits:

  • Automatic sanity checking of all configuration variables with a defined type
  • Consolidates the logic surrounding what to do with the value of a given variable. For variables like PATH, this is simply a split(), but for boolean variables, any duplication can result in unclear semantics. For example, we may not know whether our choices are 1/0, empty/nonempty, true/false, etc.
  • Makes it easier to create a configuration UI, as the type information could be used to provide a better interface than a text edit box (e.g. checkbox for 'boolean', dropdown for 'choice')

To enable configuration time type checking of the config variables which have the appropriate metadata, set the following: INHERIT += "typecheck". This is done by the 'mel' distro automatically.

List of currently available types: list, choice, regex, boolean, integer, float

An example of the appropriate metadata with failing typecheck:

BAZ = "foo"
BAZ[type] = "boolean"

$ bitbake -p
FATAL: BAZ: Invalid boolean value 'foo'

Further examples of metadata values:

FOO = "alpha"
FOO[type] = "choice"
FOO[choices] = "alpha beta theta"

PACKAGES[type] = "list"

LIBTOOL_HAS_SYSROOT = "yes"
LIBTOOL_HAS_SYSROOT[type] = "boolean"

PATH[type] = "list"
PATH[separator] = ":"

Examples of usage of these from python:

python () {
    import oe.data
    for pkg in oe.data.typed_value("PACKAGES", d):
        bb.note("package: %s" % pkg)

    for path in oe.data.typed_value("PATH", d):
        bb.note("PATH element: %s" % path)

    assert(oe.data.typed_value("FOO", d) == "alpha")
    assert(oe.data.typed_value("LIBTOOL_HAS_SYSROOT", d) == True)
}
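To make the "consolidated logic" point concrete outside of a bitbake environment, here is a simplified standalone sketch of the idea. It is not the oe.types or oe.data implementation: the function names, the accepted boolean spellings, and the use of plain dicts in place of the datastore and its flags are assumptions for illustration.

```python
def as_boolean(value):
    # One place decides which spellings count as true or false.
    text = value.strip().lower()
    if text in ("y", "yes", "1", "true"):
        return True
    if text in ("n", "no", "0", "false"):
        return False
    raise ValueError("Invalid boolean value '%s'" % value)

def as_list(value, separator=None):
    # With no separator, split on whitespace.
    return value.split() if separator is None else value.split(separator)

def as_choice(value, choices):
    if value not in choices.split():
        raise ValueError("Invalid choice '%s'" % value)
    return value

def typed_value(key, values, flags):
    """Convert values[key] according to its 'type' flag (dicts stand in
    for the metadata and its variable flags)."""
    var_flags = flags[key]
    type_name = var_flags["type"]
    if type_name == "boolean":
        return as_boolean(values[key])
    if type_name == "list":
        return as_list(values[key], var_flags.get("separator"))
    if type_name == "choice":
        return as_choice(values[key], var_flags["choices"])
    raise ValueError("Unknown type '%s' for %s" % (type_name, key))

values = {"PATH": "/usr/bin:/bin", "LIBTOOL_HAS_SYSROOT": "yes",
          "FOO": "alpha", "BAZ": "foo"}
flags = {"PATH": {"type": "list", "separator": ":"},
         "LIBTOOL_HAS_SYSROOT": {"type": "boolean"},
         "FOO": {"type": "choice", "choices": "alpha beta theta"},
         "BAZ": {"type": "boolean"}}

path_elements = typed_value("PATH", values, flags)
has_sysroot = typed_value("LIBTOOL_HAS_SYSROOT", values, flags)
foo = typed_value("FOO", values, flags)
```

Calling `typed_value("BAZ", values, flags)` raises ValueError, mirroring the failing typecheck example earlier.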

OE python package

The OE python package has a number of modules which provide some valuable pieces of functionality. Please, take the time to look over these modules before writing any python in the metadata, as they can prevent unnecessary reinvention of the wheel.

Examples (random selection of bits from the oe package):

  • oe.path.relative - return a relative path from src to dest
  • oe.path.copytree - recursively copy a tree from src to dest
  • oe.path.find - acts like the ‘find’ command, traversing a tree and yielding the full paths to the files
  • oe.utils.inherits - return True if the metadata inherits from any of the specified classes

Classes

  • externalsrc

    This class is used to build from an existing local source tree, rather than the usual fetch/unpack/patch process. This is an invaluable tool for active development.

  • gitver/gitpkgv

    gitpkgv is used to extract the version from a git repository referenced from SRC_URI, and use it in the package versioning. gitver is used for the same purpose, but to extract from a local repository, rather than a bitbake fetched repository

  • lib_package

    This class is used to split out the binaries from the main library for recipes whose primary artifact is the library

  • recipe_sanity (used via INHERIT)

    This class is used to run a set of basic sanity checks against the metadata of a recipe. It checks for some common user errors.

  • archiver (used via INHERIT)

    This class is used to archive sources/patches/logs for a recipe, usually for license compliance. It is quite configurable, to cover the common use cases, and ensure it can meet the needs of any legal department.

OpenEmbedded - Modifying variables using OVERRIDES

One common source of confusion in OpenEmbedded, specifically regarding its file format, is the behavior of appending/prepending to variables combined with conditionals. In OpenEmbedded, all conditionals in the file format are based on “OVERRIDES”. These conditionals all test whether the given string is in OVERRIDES, or is not.

The source of confusion lies in a quirk in syntax. There are two ways to both append and override in a single operation:

  • Append to a conditional variable: FOO_condition_append = " bar"
  • Conditionally append to a variable: FOO_append_condition = " bar"

The difference between the two is that the first discards the previous value of FOO, while the second preserves it.

Example of the first syntax:

FOO = "bar"
FOO_condition = "baz"
FOO_condition_append = " boo"

Here, the final result, assuming the condition is true, is “baz boo”. To consider the order of operations, first it appended “ boo” to “baz”, then FOO_condition replaced FOO when overrides were processed. This occurs even if FOO_condition is not defined. For example:

FOO = "bar"
FOO_condition_append = " baz"

Here, the final result is “ baz”, and the original value “bar” was lost. To again consider order of operations, “ baz” is appended to an empty FOO_condition, and this newly defined FOO_condition replaces FOO. This is very very rarely what you want, but the syntax is flexible, so allows you to freely shoot yourself in the foot.

With the second syntax, evaluation is a single step, not multiple steps, and is much clearer:

FOO = "bar"
FOO_append_condition = " baz"

If the condition is true, the final result is “bar baz”. This is a single operation -- append this value if the condition is true.
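The order of operations for the two forms can be modeled with a few lines of plain python. This is a toy model of the behavior described in this post, not bitbake's implementation; `resolve` is a hypothetical helper, variables are held in a plain dict keyed by their raw names, and only `_append` is handled.

```python
def resolve(name, assignments, overrides):
    """Toy model of the two append forms.

    assignments: dict of raw variable names to values,
                 e.g. "FOO_append_condition".
    overrides:   list of active conditions (the contents of OVERRIDES).
    """
    value = assignments.get(name)
    for cond in overrides:
        cond_key = "%s_%s" % (name, cond)
        cond_append = assignments.get(cond_key + "_append")
        if cond_key in assignments or cond_append is not None:
            # First form: append to FOO_<cond> (empty if undefined),
            # then FOO_<cond> replaces FOO outright.
            value = assignments.get(cond_key, "") + (cond_append or "")
    for cond in overrides:
        # Second form: append to FOO itself if the condition is active.
        extra = assignments.get("%s_append_%s" % (name, cond))
        if extra is not None:
            value = (value or "") + extra
    return value

active = ["condition"]
first = resolve("FOO", {"FOO": "bar", "FOO_condition": "baz",
                        "FOO_condition_append": " boo"}, active)
lost = resolve("FOO", {"FOO": "bar",
                       "FOO_condition_append": " baz"}, active)
second = resolve("FOO", {"FOO": "bar",
                         "FOO_append_condition": " baz"}, active)
inactive = resolve("FOO", {"FOO": "bar",
                           "FOO_append_condition": " baz"}, [])
```

With the condition active, the three examples from the text give "baz boo", " baz", and "bar baz" respectively; with no active overrides, FOO stays "bar".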

I hope this clarifies this particular OpenEmbedded quirk, as I’ve seen this as a source of confusion in the past. This is one of many quirks related to the order of operations for bitbake’s processing, and is ideally something the user should not have to concern themselves with.

OpenEmbedded - Metadata Structure - Distro, Machine, Image

There are 3 major orthogonal axes governing the build:

  • Distro
  • Machine
  • Image

The image solely controls what gets built, via dependency, and in general any image should work with any combination of machine and distro, within reason. When you want to add a random package to your device's image, 90% of the time it belongs in the image recipe, unless it's required to boot the board. The image sets an IMAGE_INSTALL variable, or its related incarnations, to add individual packages to the root filesystem. There also exists an "IMAGE_FEATURES" variable. This is a space separated list of words which governs how the image will be built. In some cases, they affect configuration of the filesystem, but in other cases, it merely selects a group of packages to be installed, as a convenience for the user. The only users of this variable are the image classes that underlie the image recipes.

The machine governs the hardware dependencies throughout the build. It selects what kernel version and recipe to use, what bootloader version and recipe to use (indirectly, usually through COMPATIBLE_MACHINE in the recipes), and so on. It also defines "MACHINE_FEATURES". This is a space separated list of words, used to declare the capabilities of the hardware. There is no specific list of what words are allowed here, and the recipes can look for any word there. There are, however, conventions in use, and the packagegroup-base recipe, which is a bit special, acts on them.

packagegroup-base is a rather important, central task, and is included in nearly every image, as it provides a base set of functionality given your machine and distro. I'll return to packagegroup-base in a moment, after describing the distro.

The notion of a Linux distribution in the OE/Yocto context differs slightly from what a traditional Linux distro is – in particular, it generally doesn't govern what the image does (package selection). The rest of it, though, is quite similar. It governs what package management system is in use, and sets a wide variety of variables that control aspects of the build: for example, whether we support NLS, what glibc binary locales to generate, and the use of optional classes to change how things build. It also defines an important variable: DISTRO_FEATURES. These are the sorts of decisions that anyone would have to make when creating a Linux distribution. "What do I want to support?"

DISTRO_FEATURES is another space separated list of words without a specific list of what can be defined. Recipes look for words here to control how they build. Another way of describing this variable is a list of things we want to support for the distribution we're producing. For example, whether we want to support "ipv6".

For instance, many recipes which can have an optional dependency on libx11 control that dependency based on whether the 'x11' distro feature is included.
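The feature check itself is just a word lookup in a space separated list. Real recipes typically do this with BitBake's `bb.utils.contains` helper; the standalone Python below is only a stand-in for that idea, and the names `contains`, `recipe_depends`, `zlib`, and `libx11` as used here are placeholders for the sketch.

```python
# Stand-in sketch: a plain-Python version of a "does this feature list
# contain this word" check. It is not BitBake's own helper.

def contains(features, word, true_value, false_value):
    """Return true_value if word is one of the space separated features."""
    return true_value if word in features.split() else false_value


def recipe_depends(distro_features):
    """Build a dependency string, adding libx11 only when 'x11' is enabled."""
    depends = ["zlib"]  # placeholder for an unconditional dependency
    optional = contains(distro_features, "x11", "libx11", "")
    if optional:
        depends.append(optional)
    return " ".join(depends)


print(recipe_depends("ipv6 x11"))  # zlib libx11
print(recipe_depends("ipv6"))      # zlib
```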

Finally, there is a variable "COMBINED_FEATURES", which has a very specific purpose. It's intended to hold words which exist in both the machine and distro features. This intersection is where hardware support (e.g. bluetooth) intersects with what the distro chooses to support. In this way, we can ensure that we support the critical pieces of hardware, while leaving certain aspects in the distro's control. For example, the hardware may have a DAC and ADC, yet the distro might choose to not leverage that capability, and so it leaves out the 'audio' feature.
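As a rough illustration, the intersection can be written as a set operation over the two word lists. The function name `combined_features` and the example feature words are assumptions made for this sketch, not taken from any particular machine or distro configuration.

```python
# Sketch of the intersection described above, using made-up feature lists.

def combined_features(machine_features, distro_features):
    """Return the words present in both space separated lists, sorted."""
    common = set(machine_features.split()) & set(distro_features.split())
    return " ".join(sorted(common))


machine = "audio bluetooth usbhost"  # what the hardware can do
distro = "bluetooth ipv6 x11"        # what the distro chooses to support

# 'audio' is dropped because the distro left it out, even though the
# hardware has it; 'x11' is dropped because it isn't a hardware capability.
print(combined_features(machine, distro))  # bluetooth
```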

The packagegroup-base recipe is the main place where distro and machine features intersect. The common pattern is that hardware support for something results in a task being emitted for it, so that task is available to be installed. If the distro also supports that capability, then that task is automatically pulled into default images via dependency from the packagegroup-base and packagegroup-base-extended tasks. This can result in a slightly greedy dependency tree, but ensures that we have what we need to support what the distro wants to support, in any given image. For images that don't care about this, they can choose to not use the packagegroup-base tasks.

It's important to remember that distro, machine, and image are always intended to be orthogonal, and this guides a great deal of what we do. We want any combination of the 3 to work, within reason. So, if a given package is required to boot a board, then it needs to be pulled in via the machine, otherwise use of a different distro would result in non-booting images. If a given package isn't required to boot the board, then it belongs in the hands of the distro or the image, and needs to be considered on a case-by-case basis.

Related to DISTRO_FEATURES is USER_FEATURES. This is functionality Mentor has added to make it easier for the user to manipulate the distro features. With this variable, they can add a distro feature (e.g. USER_FEATURES += "x11") or remove a distro feature (e.g. USER_FEATURES += "~audio").
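The add/remove semantics might look something like the following. This is a guess at the behavior based only on the description above, not Mentor's actual implementation; `apply_user_features` is a hypothetical name, and details such as ordering and duplicate handling are assumptions.

```python
# Hypothetical sketch: how "~word" removal and plain "word" addition could
# be applied to a distro feature list. Not the real USER_FEATURES code.

def apply_user_features(distro_features, user_features):
    """Apply user additions and '~' removals to a space separated list."""
    features = distro_features.split()
    for word in user_features.split():
        if word.startswith("~"):
            # "~audio" removes 'audio' if it is present.
            features = [f for f in features if f != word[1:]]
        elif word not in features:
            # A plain word adds the feature, without duplicating it.
            features.append(word)
    return " ".join(features)


print(apply_user_features("audio ipv6", "x11 ~audio"))  # ipv6 x11
```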

It's also important to remember that the features variables can be seen as statements of intent. It's up to the rest of the metadata (the recipes and classes) to implement what the distro wants, and what the hardware requires/supports. This is why there's no list of what features are "available", as the control of behavior based on these variables lies in the hands of the recipes and classes, not in the configuration metadata.
