Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Shell script that reads LDIF data from STDIN and outputs as CSV.
#!/bin/bash
#
# Converts LDIF data to CSV.
# Doesn't handle comments very well. Use -LLL with ldapsearch to remove them.
#
# 2010-03-07
# dsimmons@squiz.co.uk
#
# Show usage if we don't have the right params
if [ "$1" == "" ]; then
echo ""
echo "Usage: cat ldif.txt | $0 <attributes> [...]"
echo "Where <attributes> contains a list of space-separated attributes to include in the CSV. LDIF data is read from stdin."
echo ""
exit 99
fi
ATTRS="$*"
c=0
while read line; do
# Skip LDIF comments
[ "${line:0:1}" == "#" ] && continue;
# If this line is blank then it's the end of this record, and the beginning
# of a new one.
#
if [ "$line" == "" ]; then
output=""
# Output the CSV record
for i in $ATTRS; do
eval data=\$RECORD_${c}_${i}
output=${output}\"${data}\",
unset RECORD_${c}_${i}
done
# Remove trailing ',' and echo the output
output=${output%,}
echo $output
# Increase the counter
c=$(($c+1))
fi
# Separate attribute name/value at the semicolon (LDIF format)
attr=${line%%:*}
value=${line#*: }
# Save all the attributes in variables for now (ie. buffer), because the data
# isn't necessarily in a set order.
#
for i in $ATTRS; do
if [ "$attr" == "$i" ]; then
eval RECORD_${c}_${attr}=\"$value\"
fi
done
done
@stfnfrnr

This comment has been minimized.

Copy link

stfnfrnr commented Oct 18, 2013

awesome, just what I needed right now. Thanks!

@saric

This comment has been minimized.

Copy link

saric commented Feb 6, 2014

I've changed your script a little:

since CSV output is echoed line by line, there is no need to use counter "c" at all (it is useless), and therefore pseudo-array RECORD can be one-dimensional:

$ sed -i 's/RECORD_${c}/RECORD/g' ldif-to-csv.sh

I also needed possibility to parse multi-valued attributes, so I'm using "additional" delimiter ":". So, the latest part of modified script looks like this:

for i in $ATTRS; do
    if [ "$attr" == "$i" ]; then
        eval tmp=$(echo $(echo \$RECORD_${attr}))
        if [[ ! -z "$tmp" ]]; then
            eval RECORD_${attr}=$tmp':'\"$value\"
        else
            eval RECORD_${attr}=\"$value\"
        fi
    fi
done

The script still does not properly handle double quotes, if present in attribute value. Embedded double quote characters have to be represented by a pair of consecutive double quotes, or by prefixing an escape character such as a backslash.

Another possible issue is representation of non-ASCII characters. LDIF format uses base64 encoding for representation of values with UTF-8 characters.

@freilich

This comment has been minimized.

Copy link

freilich commented Mar 5, 2014

Added a check for base64 encoded fields and decode them:

# Separate attribute name/value at the semicolon (LDIF format)
# Check for double colon and decode the base 64 results
if [[ $line == *::* ]]
then
    attr=${line%%:*}
    value=`echo ${line#*: } | base64 --decode`
else
    attr=${line%%:*}
    value=${line#*: }
fi  
@mcreenan

This comment has been minimized.

Copy link

mcreenan commented Apr 24, 2014

Thanks this helped me move away from a simple perl script using Net::LDAP (which wasn't working anyway). Able to use this + ldapsearch instead, much better.

@joaogit

This comment has been minimized.

Copy link

joaogit commented May 15, 2014

Thanks!
With the freilich base64 "add-on" it was perfect for the job.

@jackcaesar

This comment has been minimized.

Copy link

jackcaesar commented May 28, 2014

First thing - great script - it is what I needed, so don't shoot me for the following issues! I'm thankful for the work done on it.

I think Saric's addition is good - I needed this for multiple mailalternateaddress fields which I imagine many LDIFs have), although if you have apostrophes (') in an email address (which is allowed and I came across..!) then you're going to get thrown an "eval: Unexpected end of file" error

If you change the line: eval RECORD_${attr}=$tmp':'"$value"

to eval RECORD_${attr}="$tmp':'$value"

Seems to handle it.

However this extra logic (using the tmp variable test) adds a long time to processing - fine on small LDIFs, but not something I could afford. (approx double it seems)

I made a slightly horrible hack and instead of making an IF statement, just did:

eval RECORD_${c}_${attr}+=" $value"

or, if you're removing the "c" counter (which saves a little code doesn't seem to save any time in processing):

eval RECORD_${attr}+=" $value"

The += and " " (space) means that you end up listing multiple valued fields space separated, BUT you always have a leading space even on single items - the expense of losing the "if" test...

In my use this is actually acceptable, I'm comparing 2 LDIFs, as long as there is consistently a leading space, then the diff is always correct. When I come to print the results a quick sed to replace the few instances of "open quote followed by space" is then a trivial and non time consuming task as there is often only a handful of changes - as I said, kinda horrible, but putting it out there in case anyone is interested.

I thought this was a great script - it still is, for less than a few k records, but it doesn't scale too well.

On an LDIF with 8500 entries (and the changes for multiple values I describe above) it takes 40 seconds for 7 attributes. Which I can accept.

But I have a much longer LDIF, 400,000 entries - takes about 20 minutes!

I assumed perl would be much quicker for such a task - but sadly I can't find a perl script that is "standalone" - This looks good http://www.stallies.net/index.php/convert-an-ldif-to-csv/ but it requires addition LDAP perl modules. Sadly I am working in a controlled environment, and cannot add anything to the filesystem.

Anyone got any ideas..! I am still looking, but thought I'd ask here too.

Thanks for the script!

@satishkumarsingh

This comment has been minimized.

Copy link

satishkumarsingh commented Jan 12, 2015

First of all my apologies, i am new for shell script. Second, This is something i was looking for but i seems to have some issue running this script.
sh ldif-to-csv.sh file.txt
when i run this in out put i get just filename ".txt"
may be i am missing some param or ?

Thanks for the script.

@szechuen

This comment has been minimized.

Copy link

szechuen commented Jun 14, 2015

Thanks for the bash script. It worked really well!

However, like @jackcaesar, I needed to serialize ~500k records corresponded with ~100 keys which took quite a while. Guess we've hit the ceiling in bash.

And so I reimplemented the serialization in C (https://github.com/szechuen/ldif_csv) and that cuts the running time to ~21s for me. Hope it helps someone ;)

@penguinjeff

This comment has been minimized.

Copy link

penguinjeff commented Jan 15, 2016

I did a bit of modding this to get what I needed out of it. As soon as I can figure out how to fork I will post my mods
First I needed a field to tell if an account was disabled or not and found out it was in the variable userAccountControl so I broke it up into all its Identifiers described here https://msdn.microsoft.com/en-us/library/windows/desktop/ms680832(v=vs.85).aspx
this might be useful for a lot of people. Second I broke out where it echos into a function and instead of writing on an empty line I write on the next occurrence of "dn" as this is what constituted a break according to ldif man page then after the loop I write out the last record.
I ran into a few base64 encoded values on operatingSystem value thanks to freilich his idea fixed this.

@penguinjeff

This comment has been minimized.

Copy link

penguinjeff commented Jan 15, 2016

A slight mod on freilich's code as I found some ipv6 addresses where causing errors

    if [[ $line == *::* ]] && [[ $line != *:*::* ]];then
        attr=${line%%:*}
        value=`echo ${line#*: } | base64 --decode || echo [$line] >&2`
    else
        attr=${line%%:*}
        value=${line#*: }
    fi 
@penguinjeff

This comment has been minimized.

Copy link

penguinjeff commented Jan 15, 2016

Here is what I did for UAC

after the line

ATTRS="$*"

I added

for i in $ATTRS; do
if [ "${i:0:7}" == "ADS_UF_" ] || [ "$i" == "Enabled" ];then
UAC_NEEDED="YES"
fi
done

then after the writing section I added

    if [ "$UAC_NEEDED" == "YES" ];then
            if [ "$attr" == "userAccountControl" ];then
                    #follow layout here https://msdn.microsoft.com/en-us/library/windows/desktop/ms680832(v=vs.85).aspx
                    binary=`echo "obase=2;${value}"|bc|awk '{printf("%032d",$1)}'`
                    for i in $ATTRS; do
                            bit=0
                            reverse=0
                            case "$i" in
                                    "ADS_UF_SCRIPT")                                        bit=31;;
                                    "ADS_UF_ACCOUNTDISABLE")                                bit=30;;
                                    # Inverted value of ADS_UF_ACCOUNT_DISABLE
                                    "Enabled")                                              bit=30;reverse=1;;
                                    "ADS_UF_HOMEDIR_REQUIRED")                              bit=28;;
                                    "ADS_UF_LOCKOUT" )                                      bit=27;;
                                    "ADS_UF_PASSWD_NOTREQD" )                               bit=26;;
                                    "ADS_UF_PASSWD_CANT_CHANGE" )                           bit=25;;
                                    "ADS_UF_ENCRYPTED_TEXT_PASSWORD_ALLOWED")               bit=24;;
                                    "ADS_UF_TEMP_DUPLICATE_ACCOUNT")                        bit=23;;
                                    "ADS_UF_NORMAL_ACCOUNT")                                bit=22;;
                                    "ADS_UF_INTERDOMAIN_TRUST_ACCOUNT")                     bit=20;;
                                    "ADS_UF_WORKSTATION_TRUST_ACCOUNT")                     bit=19;;
                                    "ADS_UF_SERVER_TRUST_ACCOUNT")                          bit=18;;
                                    "ADS_UF_DONT_EXPIRE_PASSWD")                            bit=15;;
                                    "ADS_UF_MNS_LOGON_ACCOUNT")                             bit=14;;
                                    "ADS_UF_SMARTCARD_REQUIRED")                            bit=13;;
                                    "ADS_UF_TRUSTED_FOR_DELEGATION")                        bit=12;;
                                    "ADS_UF_NOT_DELEGATED")                                 bit=11;;
                                    "ADS_UF_USE_DES_KEY_ONLY")                              bit=10;;
                                    "ADS_UF_DONT_REQUIRE_PREAUTH")                          bit=9;;
                                    "ADS_UF_PASSWORD_EXPIRED")                              bit=8;;
                                    "ADS_UF_TRUSTED_TO_AUTHENTICATE_FOR_DELEGATION")        bit=7;;
                            esac
                            if [ "$bit" != "0" ];then
                                    if [ "$reverse" == "1" ];then
                                            if [ "${binary:$bit:1}" == "1" ];then
                                                    eval RECORD_${c}_${i}=\"False\"
                                            else
                                                    eval RECORD_${c}_${i}=\"True\"
                                            fi
                                    else
                                            if [ "${binary:$bit:1}" == "1" ];then
                                                    eval RECORD_${c}_${i}=\"True\"
                                            else    
                                                    eval RECORD_${c}_${i}=\"False\"
                                            fi
                                    fi
                            fi
                    done
            fi
    fi
@Finkregh

This comment has been minimized.

Copy link

Finkregh commented Jan 21, 2016

i created a fork to create a dokuwiki-table: https://gist.github.com/Finkregh/51feb486cfc257558532

@deepeeess

This comment has been minimized.

Copy link

deepeeess commented May 16, 2017

This is great, thank you.

@bradtumy

This comment has been minimized.

Copy link

bradtumy commented Sep 5, 2017

Great script! Worked perfectly and was exactly what I needed to convert my ldif export to csv.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.