@joaf123
Last active August 28, 2024 18:03
Bulk Editing docx Archives Without Losing Metadata Using PowerShell

Unpack & Repack docx files for safe & easy bulk editing!

Requires the 7z CLI to be available on the system PATH!
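
Every function in this script shells out to `7z`, so checking for it up front can save a confusing failure later. A minimal sketch of such a check; the helper name `Test-SevenZip` is hypothetical and not part of the original script:

```powershell
# Hypothetical helper (not part of the original script): reports whether an
# executable named 7z can be resolved from the current PATH.
function Test-SevenZip {
  $cmd = Get-Command -Name 7z -CommandType Application -ErrorAction SilentlyContinue
  return [bool]$cmd
}

if (-not (Test-SevenZip)) {
  Write-Warning '7z was not found on PATH; Unpack-Docx and Repack-Docx will fail.'
}
```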

Unpacks or Repacks docx files so you can work with their XML contents.

Internal metadata, properties, and references stored in the files are usually overwritten or lost when such docx documents are edited with Microsoft Word.

A docx file is just a glorified zip archive. Unpacking it, editing the contents, and repacking it makes bulk edits possible, for example searching and replacing strings or swapping images stored within the document. Unlike editing in Microsoft Word, updating docx files this way also keeps their internal metadata intact.
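
The bulk edit itself happens between unpacking and repacking. As a rough sketch of the search-and-replace case, the hypothetical helper below (`Update-DocxText` is not part of the original script) rewrites `word/document.xml` in every unpacked folder under a root path. It assumes the body text lives in that part and that the search string appears contiguously in the raw XML; Word often splits visible text across several runs, in which case a plain string match will not find it. Both strings are matched literally against the raw XML, so characters such as `&` have to be given in their escaped form.

```powershell
# Hypothetical helper (not part of the original script): literal search and
# replace in the main body part of every unpacked docx folder under $Root.
function Update-DocxText {
  param (
    [Parameter(Mandatory = $true)][string]$Old,
    [Parameter(Mandatory = $true)][string]$New,
    [string]$Root = '.'
  )

  # Read and write through .NET with an explicit BOM-less UTF-8 encoding so
  # Windows PowerShell does not re-encode the XML or prepend a byte order mark.
  $utf8NoBom = New-Object System.Text.UTF8Encoding($false)

  Get-ChildItem -Path $Root -Recurse -File -Filter 'document.xml' |
    Where-Object { $_.Directory.Name -eq 'word' } |
    ForEach-Object {
      $xml = [System.IO.File]::ReadAllText($_.FullName, $utf8NoBom)
      if ($xml.Contains($Old)) {
        # String.Replace is an ordinal, case-sensitive replacement.
        [System.IO.File]::WriteAllText($_.FullName, $xml.Replace($Old, $New), $utf8NoBom)
        Write-Host "Updated $($_.FullName)"
      }
    }
}
```

Called as `Update-DocxText -Old 'Acme Ltd' -New 'Acme Inc'` from the folder that holds the unpacked documents (the company names are placeholders), it touches only files that actually contain the old string.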

Usage

Unpacking

Unpack-Docx # Unpack the docx files in the current folder into same-named folders next to them

# OR

Unpack-Docx -r # Also unpack the docx files in each immediate subfolder (one level deep)

Repacking

Repack-Docx # Repack the unpacked folders in the current folder into docx files, then delete the folders

# OR

Repack-Docx -r # Also repack the folders inside each immediate subfolder (one level deep)

Source

#====== Unpack Docx Archives ==============================================================================================================================================
function Unpack-Docx-Archives {
  7z x *.docx -o*
}

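function Unpack-Docx-Archives-Recursive {
  # Despite the name, this goes only one level deep: each immediate
  # subfolder first, then the current folder itself.
  $initpath = Get-Location
  # -Directory replaces the Attributes -eq "Directory" test, which misses
  # directories that carry any additional attribute flag (e.g. ReadOnly).
  foreach ($folder in Get-ChildItem -Directory) {
    Set-Location $folder.FullName
    7z x *.docx -o*
  }
  Set-Location $initpath
  7z x *.docx -o*
}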

function Unpack-Docx {
  param (
    [Parameter(Mandatory=$false, ValueFromPipeline = $true)]
    [switch]$r
  )

  if ($r) {
    Unpack-Docx-Archives-Recursive
  } else {
    Unpack-Docx-Archives
  }
}
#===========================================================================================================================================================================

#====== Repack Docx Archives ===============================================================================================================================================
function Repack-Docx-Archives {
  $initpath = Get-Location

  foreach ($folder in Get-ChildItem -Directory) {
    $archivePath = Join-Path -Path $initpath -ChildPath ($folder.Name + '.docx')
    Set-Location $folder.FullName
    # Only folders that still have their docx next to them are repacked;
    # the old archive is replaced by a fresh one built from the folder.
    if (Test-Path $archivePath) {
      Remove-Item -Path $archivePath
      7z a -tzip $archivePath *
    }

    # Delete the unpacked folder only if the archive is in place.
    Set-Location $initpath
    if (Test-Path $archivePath) {
      [IO.Directory]::Delete($folder.FullName, $true)
    }
  }

  Set-Location $initpath
}

function Repack-Docx-Archives-Recursive {
  # Repack the current folder first, then go one level down.
  Repack-Docx-Archives

  $initpath = Get-Location

  foreach ($folder in Get-ChildItem -Directory) {
    foreach ($subFolder in Get-ChildItem -LiteralPath $folder.FullName -Directory) {
      # Unlike Repack-Docx-Archives, every subfolder found here is repacked,
      # whether or not a matching docx already exists next to it.
      $archivePath = Join-Path -Path $folder.FullName -ChildPath ($subFolder.Name + '.docx')
      if (Test-Path $archivePath) {
        Remove-Item -Path $archivePath
      }
      # Build the archive directly in the parent folder, so no docx has to
      # be moved out of the folder that is being archived.
      Set-Location $subFolder.FullName
      7z a -tzip $archivePath *
      Set-Location $folder.FullName
      # Delete the unpacked folder only once the new archive exists.
      if (Test-Path $archivePath) {
        [IO.Directory]::Delete($subFolder.FullName, $true)
      }
    }
  }
  Set-Location $initpath
}

function Repack-Docx {
  param (
    [Parameter(Mandatory=$false, ValueFromPipeline = $true)]
    [switch]$r
  )

  if ($r) {
    Repack-Docx-Archives-Recursive
  } else {
    Repack-Docx-Archives
  }
}
#===========================================================================================================================================================================