@Anexo
Last active November 19, 2020 11:36
Read a .docx file, collect its text into a variable, and count the words. Word 2007 (.docx) format only.
<?php
require 'vendor/autoload.php';
$source = 'example.docx';
$phpword = \PhpOffice\PhpWord\IOFactory::load($source);
$sections = $phpword->getSections();
$uploadedText = '';
foreach ($sections as $section) {
    $elements = $section->getElements();
    foreach ($elements as $element) {
        if ($element instanceof \PhpOffice\PhpWord\Element\Text) {
            $uploadedText .= $element->getText();
            $uploadedText .= ' ';
        } elseif ($element instanceof \PhpOffice\PhpWord\Element\TextRun) {
            $textRunElements = $element->getElements();
            foreach ($textRunElements as $textRunElement) {
                // Only Text children are read here; other children (e.g. images) have no getText().
                if ($textRunElement instanceof \PhpOffice\PhpWord\Element\Text) {
                    $uploadedText .= $textRunElement->getText();
                    $uploadedText .= ' ';
                }
            }
        } elseif ($element instanceof \PhpOffice\PhpWord\Element\TextBreak) {
            $uploadedText .= ' ';
        } else {
            // Any other element type (tables, lists, ...) is not handled.
            throw new Exception('Unknown class type ' . get_class($element));
        }
    }
}
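// Remove literal "&nbsp;" entities and bullet characters before counting.
$uploadedText = str_replace('&nbsp;', '', $uploadedText);
$uploadedText = str_replace('•', '', $uploadedText);
// PREG_SPLIT_NO_EMPTY stops the trailing space from being counted as an empty word.
$words = preg_split('/\s+/', $uploadedText, -1, PREG_SPLIT_NO_EMPTY);
$numberWords = count($words);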
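The cleanup and counting step at the end of the snippet can be tried without PhpWord installed. Below is a minimal sketch, assuming the text has already been extracted into a string. The helper name `countWords` is hypothetical and not part of the gist. It differs from a plain `preg_split` by passing `PREG_SPLIT_NO_EMPTY`, so leading or trailing whitespace does not inflate the count.

```php
<?php
// Hypothetical helper (not in the original gist): counts the words in text
// that has already been pulled out of the document.
function countWords(string $text): int
{
    // Drop literal "&nbsp;" entities and bullet characters, as the gist does.
    $text = str_replace(['&nbsp;', '•'], '', $text);

    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY discards the empty
    // strings that leading or trailing whitespace would otherwise produce.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    return count($words);
}

// The trailing space and the bullet do not add to the count.
echo countWords("Hello  world \n• from PhpWord "), PHP_EOL; // prints 4
```

Keeping this step in its own function also leaves the extracted text available as a string, rather than overwriting it with the array of words.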
@parsibox

very good
thanks

@bobbyaxe74

Excellent code. Try to define $uploadedText = ''; somewhere around line 10.
