@amoutiers
Created January 10, 2018 17:29
A basic PHP script to migrate from Drupal to Hugo. It needs "html2markdown" installed on the server.
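<?php
// Bootstrap Drupal 7 so its node, taxonomy, and path APIs are available.
define('DRUPAL_ROOT', __DIR__);
include_once(DRUPAL_ROOT . '/includes/bootstrap.inc');
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// Load every node on the site.
$nids = db_query('SELECT DISTINCT(nid) FROM {node}')
  ->fetchCol();
$nodes = node_load_multiple($nids);

foreach ($nodes as $node) {
  $front_matter = array(
    'title' => $node->title,
    'date' => date('c', $node->created),
    'lastmod' => date('c', $node->changed),
    'draft' => 'false',
  );

  // taxonomy_vocabulary_2 is exported as Hugo tags. !empty() avoids an
  // undefined-index notice (and a count() error) when the field is unset.
  if (!empty($node->taxonomy_vocabulary_2[LANGUAGE_NONE])) {
    $tags = taxonomy_term_load_multiple(
      array_column(
        $node->taxonomy_vocabulary_2[LANGUAGE_NONE],
        'tid'
      )
    );
    $front_matter['tags'] = array_column($tags, 'name');
  }

  // taxonomy_vocabulary_1 is exported as Hugo categories.
  if (!empty($node->taxonomy_vocabulary_1[LANGUAGE_NONE])) {
    $cat = taxonomy_term_load_multiple(
      array_column(
        $node->taxonomy_vocabulary_1[LANGUAGE_NONE],
        'tid'
      )
    );
    $front_matter['categories'] = array_column($cat, 'name');
  }

  // Keep the Drupal path alias as the Hugo URL when the node has one;
  // otherwise fall back to the node ID for the directory name.
  $path = drupal_get_path_alias('node/'.$node->nid);
  if ($path != 'node/'.$node->nid) {
    $front_matter['url'] = '/'.$path;
    $content_dir = explode('/', $path);
    $content_dir = end($content_dir);
  }
  else {
    $content_dir = $node->nid;
  }

  // Hugo accepts JSON front matter at the top of a content file.
  $content = json_encode(
    $front_matter,
    JSON_PRETTY_PRINT|JSON_UNESCAPED_SLASHES|JSON_UNESCAPED_UNICODE
  );
  $content .= "\n\n";

  // Convert the HTML body to Markdown with the external html2markdown tool.
  // The body is keyed by language code: 'fr' here. Language-neutral nodes
  // use LANGUAGE_NONE ('und') instead.
  $tmp_file = '/tmp/node.html';
  file_put_contents($tmp_file, $node->body['fr'][0]['value']);
  $body = shell_exec('html2markdown '.escapeshellarg($tmp_file));
  unlink($tmp_file);
  //$body = $node->body['fr'][0]['value'];
  $content .= $body;

  // Write one page bundle per node: content/<type>/<dir>/index.md.
  $dir_name = '/tmp/hugo/content/'.$node->type.'/'.$content_dir;
  if (!is_dir($dir_name)) {
    mkdir($dir_name, 0777, true);
  }
  file_put_contents($dir_name.'/index.md', $content);
}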
@rickysarraf

I have been running this script to migrate my Drupal 7 installation to Hugo. The script does copy the content, but the body of each post comes out empty. As far as I can tell, $node->body['fr'][0]['value'] is coming up empty for me. Is there some way I can debug this further, please?
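
One way to check this, sketched under the assumption that Drupal 7 stores field values keyed by language code (as in $node->body['fr'][0]['value']), is to list the language keys that actually hold a body. The helper name body_language_keys and the sample array below are hypothetical, not part of the script.

```php
<?php
// Hypothetical helper: return the language codes under which a Drupal 7
// body field array holds a non-empty 'value'.
function body_language_keys(array $body) {
  $keys = array();
  foreach ($body as $langcode => $items) {
    if (isset($items[0]['value']) && $items[0]['value'] !== '') {
      $keys[] = $langcode;
    }
  }
  return $keys;
}

// Sample data shaped like $node->body (assumed structure, not real content).
$sample_body = array(
  'und' => array(array('value' => '<p>Hello</p>')),
);
print_r(body_language_keys($sample_body));
```

Inside the script's foreach loop, the equivalent call would be print_r(body_language_keys((array) $node->body)). If it never prints 'fr', the hard-coded key is the likely culprit.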

@rickysarraf

Here's the exact error:

rrs@lenovo:~/rrs-home/public_html/drush/rhut_staging$ php ./drupal7_to_hugo.php 
PHP Notice:  Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208

Notice: Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208
PHP Notice:  Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208

I think this notice can be safely ignored. It seems to come from the Drupal site name, which needs to resolve. Once I fixed my DNS entries, the notice went away.
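
Another common way to silence this notice, offered as a sketch rather than a confirmed fix for this setup: PHP does not populate $_SERVER['REMOTE_ADDR'] when a script runs from the command line, so it can be filled in before drupal_bootstrap() is called. The helper name with_cli_server_defaults and the 127.0.0.1 placeholder are assumptions.

```php
<?php
// Hypothetical helper: return a copy of $server with REMOTE_ADDR filled in
// when it is missing, as happens under the PHP CLI.
function with_cli_server_defaults(array $server) {
  if (!isset($server['REMOTE_ADDR'])) {
    $server['REMOTE_ADDR'] = '127.0.0.1'; // assumed loopback placeholder
  }
  return $server;
}

// Apply this before including bootstrap.inc and calling drupal_bootstrap().
$_SERVER = with_cli_server_defaults($_SERVER);
```

An existing REMOTE_ADDR value is left untouched, so the same lines are harmless if the script is ever run through a web server.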

@rickysarraf

I added some more print statements to show exactly what problem I'm running into:

rrs@lenovo:~/rrs-home/public_html/drush/rhut_staging$ php ./drupal7_to_hugo.php 
{
    "title": "RESEARCHUT » RESEARCHUT",
    "date": "2011-01-22T13:44:06-05:00",
    "lastmod": "2011-01-29T04:34:05-05:00",
    "draft": "false",
    "categories": [
        "General"
    ],
    "url": "/blog/rrs"
}



{
    "title": "One week with the move",
    "date": "2010-12-03T15:45:00-05:00",
    "lastmod": "2011-01-22T13:44:10-05:00",
    "draft": "false"
}



{
    "title": "apt-offline - 1.0",
    "date": "2010-11-08T09:55:00-05:00",
    "lastmod": "2011-01-29T10:16:08-05:00",
    "draft": "false",
    "tags": [
        "apt-offline",
        "offline package manager",
        "apt-offline GUI"
    ],
    "categories": [
        "Debian-Blog",
        "Programming"
    ]
}



{
    "title": "Pomfret and Sardine",
    "date": "2010-11-07T14:05:00-05:00",
    "lastmod": "2011-01-22T13:44:10-05:00",
    "draft": "false"
}

@rickysarraf

I finally figured it out. I was too lazy to look up the PHP manual but, as is always the case, I need to eat my own dog food.
The problem was with:

Array
(
    [und] => Array
        (
            [0] => Array
                (
                    [value] => <p>sg3-utils, version 1.44, was recently <a href="https://tracker.debian.org/pkg/sg3-utils">uploaded</a> to Debian. This new upstream release has happened almost 2.5 years after the last release. One important feature to emphasize about is some support for NVMe disks, which are now getting more common on latest range of laptops.</p>

The language key was reported as und for whatever reason, and I don't want to investigate why. After I changed the script to match, I was able to import all my data from Drupal 7 to Hugo.

Thank you so much for writing this script. I, a web n00b, was able to use it and migrate away from the painful platform that Drupal has now become.

Next on my list is to look for a self-hosted commenting system.

@adrinux

adrinux commented Feb 12, 2021

Same problem as rickysarraf: the language was undefined on my nodes, and the script hard-codes 'fr' on line 60. Switch 'fr' to 'und' and node content is exported where possible.
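
For sites that mix French and language-neutral nodes, a fallback may be more convenient than editing the key by hand. The sketch below assumes the body array is shaped like $node->body in the script; the helper name pick_body_value and the default language order are hypothetical.

```php
<?php
// Hypothetical helper: pick the body HTML from a Drupal 7 body field array,
// trying each language code in order and returning '' if none has a value.
function pick_body_value(array $body, array $langcodes = array('fr', 'und')) {
  foreach ($langcodes as $langcode) {
    if (!empty($body[$langcode][0]['value'])) {
      return $body[$langcode][0]['value'];
    }
  }
  return '';
}

// Sample data shaped like $node->body on a language-neutral node (assumed).
$sample_body = array('und' => array(array('value' => '<p>Hello</p>')));
echo pick_body_value($sample_body), "\n";
```

In the script, this would replace the hard-coded lookup, for example file_put_contents($tmp_file, pick_body_value((array) $node->body)). Drupal 7's own field_get_items('node', $node, 'body') is another option, since it resolves the field language itself.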

I was also migrating a site that was previously hosted in Aegir, so I needed to run the script through drush: 'drush php-script drupal7_to_hugo.php'.

With those changes it worked well. Thank you @amoutiers for sharing.
