@pkuczynski
Last active September 23, 2024 01:42
Read YAML file from Bash script
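parse_yaml.sh:

#!/bin/sh
# Print prefix_parent_key="value" lines for a simple YAML file indented with two spaces.
# sed rewrites each "key: value" line as <indent><fs><key><fs><value>;
# awk tracks the parent key at each indent level and prints one assignment per value.
parse_yaml() {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$1" |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}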

test.sh:

#!/bin/sh

# include parse_yaml function
. parse_yaml.sh

# read yaml file
eval $(parse_yaml zconfig.yml "config_")

# access yaml content
echo $config_development_database

zconfig.yml:

development:
  adapter: mysql2
  encoding: utf8
  database: my_database
  username: root
  password:

@byronmansfield

This is fantastic! Thank you to everyone who has contributed to this. It was exactly what I was looking for. As @wadewegner mentioned above, it has issues with hyphenated hash keys such as proj-ect:. @wadewegner, you said that your first pure bash change worked for you but had array issues; I was unsuccessful with your approach. Unfortunately, I am not able to introduce other tooling, such as Python packages like niet, into my project. Has anyone been able to get this working for hyphenated keys?
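
One possible approach for hyphenated keys without extra tooling, as a minimal sketch rather than a tested fix. It assumes it is acceptable for a key such as proj-ect to become the variable proj_ect. The name parse_yaml_hyphen and the demo file are made up for this example; the function differs from the gist's parse_yaml only in that the key pattern also accepts "-", awk rewrites "-" to "_" in key names (values are untouched), and the prefix is passed with -v. It has not been checked against arrays.

```shell
#!/bin/sh
# Hypothetical variant of parse_yaml: accepts hyphens in keys and maps them to underscores.
parse_yaml_hyphen() {
   local prefix=$2
   # Same patterns as the gist, except that w also allows "-" (last in the bracket, so literal)
   local s='[[:space:]]*' w='[a-zA-Z0-9_-]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$1" |
   awk -F"$fs" -v prefix="$prefix" '{
      indent = length($1)/2;
      key = $2;
      gsub(/-/, "_", key);   # "-" is not valid in a shell variable name
      vname[indent] = key;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", prefix, vn, key, $3);
      }
   }'
}

# Example input with hyphenated keys at two levels
hyphen_demo=$(mktemp)
cat > "$hyphen_demo" <<'EOF'
proj-ect: demo
my-app:
  db-name: my_database
EOF
eval "$(parse_yaml_hyphen "$hyphen_demo" "cfg_")"
echo "$cfg_proj_ect" "$cfg_my_app_db_name"
rm -f "$hyphen_demo"
```

Note that two keys that differ only in "-" versus "_" would end up in the same variable with this mapping.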

@misterboe

How could I "echo" all of the variables that were set?
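
Two options come to mind. Calling parse_yaml without eval already prints every assignment it generates. After the eval, a sketch like the following lists the variables by prefix; list_vars is a hypothetical helper, and the two assignments stand in for what the eval would have defined.

```shell
#!/bin/sh
# Hypothetical helper: print every shell variable whose name starts with the given prefix
list_vars() {
   set | grep "^$1"
}

# Stand-ins for what eval $(parse_yaml zconfig.yml "config_") would define
config_development_adapter="mysql2"
config_development_database="my_database"

list_vars config_
```

The exact quoting of the values in the set output differs between shells, so this is meant for inspection rather than for parsing.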

@BGMP

BGMP commented Feb 3, 2019

Thank you very much!

@djentlguy

djentlguy commented Mar 13, 2019

I am trying to use Martin's modified parser from Stack Overflow. Input file:

schemas:
- name: exports
  tables:
  - name: wolverine
    description: sample message here
      may overflow next line
    active_date: 2019-01-07 00:00:00
    columns:
    - name: height
      type: timestamp without time zone
    - name: strength
      type: bigint
      description: sample message here
        may overflow next line
      example: 21352352
    - name: power
      type: bigint
      description: sample message here
        may overflow next line
      example: 10001
  - name: cyclops
    description: sample message here
      may overflow next line
    active_date: 2018-12-15 00:00:00
    columns:
    - name: size
      type: datetime
      description: sample message here
        may overflow next line
      example: 2018-03-03 12:30:00
    - name: power
      type: timestamp without time zone
      description: sample message here
        may overflow next line

Parse function:

function parse_yaml {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|,$s\]$s\$|]|" \
        -e ":1;s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s,$s\(.*\)$s\]|\1\2: [\3]\n\1  - \4|;t1" \
        -e "s|^\($s\)\($w\)$s:$s\[$s\(.*\)$s\]|\1\2:\n\1  - \3|;p" $1 | \
   sed -ne "s|,$s}$s\$|}|" \
        -e ":1;s|^\($s\)-$s{$s\(.*\)$s,$s\($w\)$s:$s\(.*\)$s}|\1- {\2}\n\1  \3: \4|;t1" \
        -e    "s|^\($s\)-$s{$s\(.*\)$s}|\1-\n\1  \2|;p" | \
   sed -ne "s|^\($s\):|\1|" \
        -e "s|^\($s\)-$s[\"']\(.*\)[\"']$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)-$s\(.*\)$s\$|\1$fs$fs\2|p" \
        -e "s|^\($s\)\($w\)$s:$s[\"']\(.*\)[\"']$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" | \
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]; idx[i]=0}}
      if(length($2)== 0){  vname[indent]= ++idx[indent] };
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) { vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, vname[indent], $3);
      }
   }'
}

Current Output :

1="name: exports"
1_1="name: wolverine"
1_1_description="sample message here"
1_1_active_date="2019-01-07 00:00:00"
1_1_1="name: height"
1_1_1_type="timestamp without time zone"
1_1_2="name: strength"
1_1_2_type="bigint"
1_1_2_description="sample message here"
1_1_2_example="21352352"
1_1_3="name: power"
1_1_3_type="bigint"
1_1_3_description="sample message here"
1_1_3_example="10001"
1_2="name: cyclops"
1_2_description="sample message here"
1_2_active_date="2018-12-15 00:00:00"
1_2_1="name: size"
1_2_1_type="datetime"
1_2_1_description="sample message here"
1_2_1_example="2018-03-03 12:30:00"
1_2_2="name: power"
1_2_2_type="timestamp without time zone"
1_2_2_description="sample message here"

But I need the values as nested CSV instead of synthetic numbered markers. The idea is to express the nested relationship from parent to child elements. Everything at the same level should be written on one line, delimited by commas, like below:

Wolverine.height,"timestamp without time zone",,
Wolverine.strength,bigint,"sample message here may overflow next line",21352352
Wolverine.power,bigint,"sample message here may overflow next line",10001
Cyclops.size,datetime,"sample message here may overflow next line","2018-03-03 12:30:00"
Cyclops.power,"timestamp without time zone",something,"sample message here may overflow next line",,

How can I reformat it? Can someone help?

@tec82263

Hi,
I tried to use this in Jenkins for job config parsing.
It ran successfully with Python 2.6.6 but not with Python 2.7.5.
Help would be really appreciated.

Ran with Python 2.6.6

+ echo 'Get Job Properties'
Get Job Properties
+ /usr/bin/python -V
Python 2.6.6
+ cd /bms/webapps/jenkins/workspace/pro-group/test/yaml_parser
+ . parse_yaml.sh
++ parse_yaml access_info.yaml accessconfig_

Ran with Python 2.7.5

+ echo 'Get Job Properties'
Get Job Properties
+ /usr/bin/python -V
Python 2.7.5
+ cd /bms/webapps/jenkins/workspace/pro-group/test/yaml_parser
+ . parse_yaml.sh
/bms/webapps/jenkins/jenkins881879139786658308.sh: line 29: .: parse_yaml.sh: file not found
Build step 'Execute shell' marked build as failure

@TenzinCando

Thank you for publishing! It helped save a lot of time 👍

@4383

4383 commented Oct 9, 2019

Hello,

niet now supports an eval output format, so you can do things like this:

$ pip install -U niet
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval .
 foo_biz="bar";fizz__buzz=( zero one two three )
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval .)
$ echo ${foo_biz}
bar
$ echo ${fizz__buzz[@]}
zero one two three
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval '"foo-biz"'); echo ${foo_biz}
bar
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval fizz.buzz
fizz_buzz=( zero one two three );
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval fizz.buzz)
$ for el in "${fizz_buzz[@]}"; do echo $el; done
zero
one
two
three

niet works with JSON and YAML input and can also convert one format to the other. Install it with pip install -U niet.

@mundrusatish

mundrusatish commented Nov 21, 2019

Thank you. Can we validate that username, as below, is mandatory and has a non-empty value? Right now an empty value passes.

development:
  adapter: mysql2
  encoding: utf8
  database: my_database
  username:

If anyone has tried this, please share. Thanks very much.
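
A minimal sketch of one way to check this after the eval. The gist's parse_yaml skips keys whose value is empty, so the variable for an empty username is never defined; the check therefore treats "unset" and "empty" the same. require_var is a hypothetical helper, and the assignment stands in for the result of the eval.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named variable is set and non-empty
require_var() {
   local value
   eval "value=\${$1:-}"
   if [ -z "$value" ]; then
      echo "missing required setting: $1" >&2
      return 1
   fi
}

# Stand-in for the result of eval $(parse_yaml zconfig.yml "config_")
config_development_database="my_database"

require_var config_development_database && echo "database ok"
require_var config_development_username || echo "username missing"
```

In a real script the failing branch would more likely be exit 1 than an echo.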

@bukowa

bukowa commented Jan 26, 2020

@babubalagani

@pkuczynski, it works for me as well for reading the keys in a YAML file. Thanks for that. Do you have a script to replace a value after reading it? I have to replace each value that I read.
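
The gist itself only reads values. As a rough sketch of writing one back with sed, under loud assumptions: set_yaml_value is a hypothetical helper, it replaces the value on every line whose key has the given name regardless of the parent, and the new value must not contain "|", "&" or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: print the YAML file $1 with the value of every "$2:" line replaced by $3
set_yaml_value() {
   sed "s|^\([[:space:]]*$2:[[:space:]]*\).*\$|\1$3|" "$1"
}

replace_demo=$(mktemp)
cat > "$replace_demo" <<'EOF'
development:
  database: my_database
  username: root
EOF
# Prints the file with "username: admin" instead of "username: root"
set_yaml_value "$replace_demo" username admin
rm -f "$replace_demo"
```

The result goes to standard output, so redirect it to a new file and move that over the original if the change should be saved.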

@inieves

inieves commented Apr 25, 2020

Hi! What is the expected output from running test.sh? Perhaps listing an output file above would be useful.
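
For what it's worth, here is a self-contained sketch of what the gist should produce, assuming the keys in zconfig.yml are indented with two spaces under development:. The function is the gist's parse_yaml (with only "$1" quoted); the temporary file and the name demo_yaml exist only for this example.

```shell
#!/bin/sh
# parse_yaml as posted in the gist, with "$1" quoted
parse_yaml() {
   local prefix=$2
   local s='[[:space:]]*' w='[a-zA-Z0-9_]*' fs=$(echo @|tr @ '\034')
   sed -ne "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \
        -e "s|^\($s\)\($w\)$s:$s\(.*\)$s\$|\1$fs\2$fs\3|p" "$1" |
   awk -F$fs '{
      indent = length($1)/2;
      vname[indent] = $2;
      for (i in vname) {if (i > indent) {delete vname[i]}}
      if (length($3) > 0) {
         vn=""; for (i=0; i<indent; i++) {vn=(vn)(vname[i])("_")}
         printf("%s%s%s=\"%s\"\n", "'$prefix'",vn, $2, $3);
      }
   }'
}

# zconfig.yml from the gist, assuming two-space indentation
demo_yaml=$(mktemp)
cat > "$demo_yaml" <<'EOF'
development:
  adapter: mysql2
  encoding: utf8
  database: my_database
  username: root
  password:
EOF

# Print the generated assignments, then load them the way test.sh does
parse_yaml "$demo_yaml" "config_"
eval "$(parse_yaml "$demo_yaml" "config_")"
echo "$config_development_database"
rm -f "$demo_yaml"
```

With that input, parse_yaml should print four assignments (config_development_adapter, config_development_encoding, config_development_database and config_development_username). The empty password key is skipped, and the final echo prints my_database.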

@sonisrje

Works great.

@Nawanop-AMNB

nice script!

@niekwit

niekwit commented Feb 14, 2021

Works really well, thank you!

@harryberto

description: sample message here
  may overflow next line

I have a similar issue. How could we store a description value that may overflow onto the next line in the same variable? That is, go from

1_2_1_description="sample message here"

to

1_2_1_description="sample message here may overflow next line"

Thank you!
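
One way this might be approached is to join continuation lines before the file reaches the parser. The sketch below is only that, under stated assumptions: join_continuations is a hypothetical pre-filter, any line that does not start with "- " or a single-word "key:" is treated as a continuation of the previous line, and blank lines are dropped.

```shell
#!/bin/sh
# Hypothetical pre-filter: append continuation lines to the "key: value" line before them
join_continuations() {
   awk '
      /^[ \t]*$/ { next }                          # drop blank lines
      /^[ \t]*[a-zA-Z0-9_]+[ \t]*:/ || /^[ \t]*-/ {
         if (have) print buf                       # a new entry starts: emit the previous one
         buf = $0; have = 1; next
      }
      {
         sub(/^[ \t]+/, "")                        # continuation: strip its indentation
         buf = have ? (buf " " $0) : $0; have = 1
      }
      END { if (have) print buf }
   ' "$1"
}

join_demo=$(mktemp)
cat > "$join_demo" <<'EOF'
description: sample message here
  may overflow next line
example: 10001
EOF
join_continuations "$join_demo"
rm -f "$join_demo"
```

The joined output could then be written to a temporary file and passed to the parse function in place of the original.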

@aahnik

aahnik commented Apr 3, 2021

I think Python is superior for this kind of scripting. Why on earth would anyone use a bash script to do such a thing? Python is generally pre-installed on macOS and Linux.

Python is much cleaner and more readable.

pip install pyyaml

import yaml

FILENAME = 'your_file.yml'

with open(FILENAME) as file:
    data: dict = yaml.full_load(file)

# data is a python dictionary

@flee2free

The traversal logic is super. Love it!! Sometimes it's just fun to do this crazy stuff with Bash.

@aleon1220

jq for JSON and yq for YAML
https://github.com/mikefarah/yq

@kokosowy

kokosowy commented Sep 8, 2021

Hi all! This is a very interesting thread. Does anybody know how to make it work when a string spans several lines (through the "|" pipe)? Example:

key1: |
   1st line
   2nd line
   3rd line
key2: |
   1st line
   2nd line
   3rd line
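
The gist's sed patterns work line by line, so a "|" block would need to be folded onto one line first. A rough sketch of such a pre-filter follows. fold_block_scalars is a hypothetical name, the block's lines are joined with a literal backslash-n inside double quotes, and a blank line inside a block ends it, which is a simplification of real YAML.

```shell
#!/bin/sh
# Hypothetical pre-filter: rewrite a "key: |" block as key: "line1\nline2" on one line
fold_block_scalars() {
   awk '
      function emit_block() {
         if (inblock) { print head " \"" body "\""; inblock = 0 }
      }
      # "key: |" opens a literal block; remember the key and its indentation
      /^[ \t]*[a-zA-Z0-9_]+[ \t]*:[ \t]*[|][ \t]*$/ {
         emit_block()
         head = $0; sub(/[ \t]*[|][ \t]*$/, "", head)
         key = $0; sub(/^[ \t]+/, "", key)
         base = length($0) - length(key)
         body = ""; first = 1; inblock = 1; next
      }
      inblock {
         line = $0; sub(/^[ \t]+/, "", line)
         indent = length($0) - length(line)
         if (indent > base && line != "") {
            # Deeper-indented, non-blank line: part of the block
            body = first ? line : (body "\\n" line); first = 0; next
         }
         emit_block()                               # anything else ends the block
      }
      { print }
      END { emit_block() }
   ' "$1"
}

block_demo=$(mktemp)
cat > "$block_demo" <<'EOF'
key1: |
   1st line
   2nd line
   3rd line
key2: |
   1st line
   2nd line
   3rd line
EOF
fold_block_scalars "$block_demo"
rm -f "$block_demo"
```

After parsing and eval, the variable would hold the backslash-n sequences literally; something like printf '%b\n' "$key1" turns them back into real line breaks.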

@sonjz

sonjz commented Oct 28, 2021

Thanks for this.
I added this to the end, after awk, as a hack to keep $ from being interpolated (application: reading serverless.yml config into bash):

 | tr "$" "#" # don't want to interpret $
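
An alternative that keeps the "$" characters in the value, sketched under the assumption that the output is going to be passed to eval: put a backslash in front of every "$" instead of replacing it. escape_dollars is a hypothetical filter name, and the printf line stands in for one line of parser output.

```shell
#!/bin/sh
# Hypothetical filter: put a backslash before every "$" so eval keeps it literal
escape_dollars() {
   sed 's/\$/\\$/g'
}

# Stand-in for one line of parse_yaml output that contains a serverless-style reference
printf '%s\n' 'config_stage="${opt:stage}"' | escape_dollars
```

It would go in the same place as the tr call, at the end of the pipeline whose output is passed to eval.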

@Nigam8972

I can't figure out why it's not running successfully for me. Whenever I execute ./test.sh it gives me this error:
./test.sh
./test.sh: line 4: .: parse_yaml.sh: file not found
./test.sh: line 7: parse_yaml: command not found

@Nigam8972

Has anybody else faced this issue? Please help.

@potes74

potes74 commented Jan 30, 2022

@Nigam8972 try with
. ./parse_yaml.sh
if they are in the same folder (it's what I use)
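
The reason this helps is that a POSIX sh looks up a name without a slash given to "." in PATH rather than in the current directory. If the script may also be started from another directory, one option is to source relative to the script's own location. The sketch below builds a throwaway demo for that; lib.sh, main.sh and hello are made-up names, not part of the gist.

```shell
#!/bin/sh
demo_dir=$(mktemp -d)

# Hypothetical library file that defines one function
printf '%s\n' 'hello() { echo "hello from lib"; }' > "$demo_dir/lib.sh"

# Script that sources its library relative to its own location
cat > "$demo_dir/main.sh" <<'EOF'
#!/bin/sh
script_dir=$(cd "$(dirname "$0")" && pwd)
. "$script_dir/lib.sh"
hello
EOF

# Run it from a different working directory; the library is still found
source_demo_output=$(cd / && sh "$demo_dir/main.sh")
echo "$source_demo_output"
rm -rf "$demo_dir"
```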

@ppenguin

ppenguin commented Mar 2, 2023

Cool, just discovered this; of course the more recent suggestions are more robust and feature-rich, but I found myself needing this on constrained (i.e. busybox, sh and no python) systems where going POSIX-sh only is the way to go.

The method here by @wadewegner (or similar) to pre-process problematic keys (but not their values) was the way to go, i.e. by using "advanced" sed hold-space etc.