A basic PHP script to migrate from Drupal 7 to Hugo. It needs "html2markdown" installed on the server.
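<?php
// Bootstrap Drupal 7 so its node, field and taxonomy APIs are available.
// DRUPAL_ROOT is __DIR__, so this script must live in the Drupal root.
define('DRUPAL_ROOT', __DIR__);
include_once(DRUPAL_ROOT . '/includes/bootstrap.inc');
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// Load every node on the site.
$nids = db_query('SELECT DISTINCT(nid) FROM {node}')
  ->fetchCol();
$nodes = node_load_multiple($nids);

foreach ($nodes as $node) {
  // Hugo front matter, serialized as JSON further down.
  // Note: 'draft' is written as the string "false", not a JSON boolean.
  $front_matter = array(
    'title' => $node->title,
    'date' => date('c', $node->created),
    'lastmod' => date('c', $node->changed),
    'draft' => 'false',
  );

  // taxonomy_vocabulary_2 is mapped to Hugo tags; adjust the field name
  // to your site. !empty() avoids a notice when the node has no terms.
  if (!empty($node->taxonomy_vocabulary_2[LANGUAGE_NONE])) {
    $tags = taxonomy_term_load_multiple(
      array_column(
        $node->taxonomy_vocabulary_2[LANGUAGE_NONE],
        'tid'
      )
    );
    $front_matter['tags'] = array_column($tags, 'name');
  }

  // taxonomy_vocabulary_1 is mapped to Hugo categories.
  if (!empty($node->taxonomy_vocabulary_1[LANGUAGE_NONE])) {
    $cat = taxonomy_term_load_multiple(
      array_column(
        $node->taxonomy_vocabulary_1[LANGUAGE_NONE],
        'tid'
      )
    );
    $front_matter['categories'] = array_column($cat, 'name');
  }

  // Keep the Drupal path alias as the Hugo URL when one exists;
  // otherwise fall back to the node ID for the directory name.
  $path = drupal_get_path_alias('node/' . $node->nid);
  if ($path != 'node/' . $node->nid) {
    $front_matter['url'] = '/' . $path;
    $content_dir = explode('/', $path);
    $content_dir = end($content_dir);
  }
  else {
    $content_dir = $node->nid;
  }

  $content = json_encode(
    $front_matter,
    JSON_PRETTY_PRINT|JSON_UNESCAPED_SLASHES|JSON_UNESCAPED_UNICODE
  );
  $content .= "\n\n";

  // Convert the HTML body to Markdown with the external html2markdown tool.
  // The body language is hard-coded to 'fr'; change it to match your site
  // (for example LANGUAGE_NONE, i.e. 'und').
  $tmp_file = '/tmp/node.html';
  file_put_contents($tmp_file, $node->body['fr'][0]['value']);
  $body = shell_exec('html2markdown ' . escapeshellarg($tmp_file));
  unlink($tmp_file);
  //$body = $node->body['fr'][0]['value'];
  $content .= $body;

  // Write one page bundle per node: content/<type>/<dir>/index.md.
  $dir_name = '/tmp/hugo/content/' . $node->type . '/' . $content_dir;
  if (!is_dir($dir_name)) {
    mkdir($dir_name, 0777, true);
  }
  file_put_contents($dir_name . '/index.md', $content);
}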
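
The script writes the draft flag as the string "false". Hugo's draft setting is a boolean, so here is a minimal sketch, assuming you would rather emit a real JSON boolean. The helper name build_front_matter() and the sample title and timestamps are made up for illustration and are not part of the gist.

```php
<?php
// Hypothetical helper (not in the original script): same front matter
// fields as the script, but with 'draft' as a PHP boolean so that
// json_encode() writes false instead of "false".
function build_front_matter($title, $created, $changed) {
    return array(
        'title' => $title,
        'date' => date('c', $created),
        'lastmod' => date('c', $changed),
        'draft' => false,
    );
}

// Sample values for illustration only.
$front_matter = build_front_matter('Example post', 1291409100, 1295721850);

$json = json_encode(
    $front_matter,
    JSON_PRETTY_PRINT|JSON_UNESCAPED_SLASHES|JSON_UNESCAPED_UNICODE
);
echo $json, "\n";
```

In the script itself the equivalent change would be replacing 'false' with false in the $front_matter array.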

rickysarraf commented Nov 5, 2018

I have been running this script to migrate my Drupal 7 installation to Hugo. The script does copy the content, but the body of each page remains empty. As far as I can tell, $node->body['fr'][0]['value'] is coming up empty for me. Is there some way I can debug this further, please?

rickysarraf commented Nov 5, 2018

Here's the exact error:

rrs@lenovo:~/rrs-home/public_html/drush/rhut_staging$ php ./drupal7_to_hugo.php 
PHP Notice:  Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208

Notice: Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208
PHP Notice:  Undefined index: REMOTE_ADDR in /media/SSHD/rrs-home/public_html/drush/rhut_staging/includes/bootstrap.inc on line 3208

And I think this warning can be safely ignored. It is just coming from the Drupal site name, which needs to resolve. Once I fixed my DNS entries, the warning message went away.
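
The notice itself points at $_SERVER['REMOTE_ADDR'], which PHP does not set when a script runs from the command line because there is no web request. Another way to avoid it is a minimal sketch like the following, assuming the script is run with the PHP CLI and that these lines are placed before the call to drupal_bootstrap(). The helper name with_cli_defaults() and the 127.0.0.1 address are assumptions for illustration.

```php
<?php
// Hypothetical helper (not in the original script): add a REMOTE_ADDR
// entry only when the key is missing. The array union operator keeps
// any value that is already present.
function with_cli_defaults(array $server) {
    return $server + array('REMOTE_ADDR' => '127.0.0.1');
}

// Place this before drupal_bootstrap() in the migration script.
if (PHP_SAPI === 'cli') {
    $_SERVER = with_cli_defaults($_SERVER);
}
```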

rickysarraf commented Nov 5, 2018

I added some more print statements to show exactly what problem I'm running into. The front matter is written, but the body after it is empty:

rrs@lenovo:~/rrs-home/public_html/drush/rhut_staging$ php ./drupal7_to_hugo.php 
{
    "title": "RESEARCHUT » RESEARCHUT",
    "date": "2011-01-22T13:44:06-05:00",
    "lastmod": "2011-01-29T04:34:05-05:00",
    "draft": "false",
    "categories": [
        "General"
    ],
    "url": "/blog/rrs"
}



{
    "title": "One week with the move",
    "date": "2010-12-03T15:45:00-05:00",
    "lastmod": "2011-01-22T13:44:10-05:00",
    "draft": "false"
}



{
    "title": "apt-offline - 1.0",
    "date": "2010-11-08T09:55:00-05:00",
    "lastmod": "2011-01-29T10:16:08-05:00",
    "draft": "false",
    "tags": [
        "apt-offline",
        "offline package manager",
        "apt-offline GUI"
    ],
    "categories": [
        "Debian-Blog",
        "Programming"
    ]
}



{
    "title": "Pomfret and Sardine",
    "date": "2010-11-07T14:05:00-05:00",
    "lastmod": "2011-01-22T13:44:10-05:00",
    "draft": "false"
}

rickysarraf commented Nov 5, 2018

I finally figured it out. I was too lazy to look up the PHP manual, but as is always the case, I need to eat my own dog food.
The problem was with:

Array
(
    [und] => Array
        (
            [0] => Array
                (
                    [value] => <p>sg3-utils, version 1.44, was recently <a href="https://tracker.debian.org/pkg/sg3-utils">uploaded</a> to Debian. This new upstream release has happened almost 2.5 years after the last release. One important feature to emphasize about is some support for NVMe disks, which are now getting more common on latest range of laptops.</p>

The language key of the body was reported as 'und' for whatever reason, and I don't want to investigate it. After I changed the language key in the script accordingly, I was able to import all my data from Drupal 7 to Hugo.
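
For anyone hitting the same empty body, here is a minimal sketch of how to see which language keys a body field has and how to fall back from 'fr' to 'und'. The helper name pick_body_value() and the sample array are made up for illustration; in the script you would pass $node->body instead of the stand-in array.

```php
<?php
// Hypothetical helper (not in the original script): return the body HTML
// from the first language key that has a value, or '' when none does.
function pick_body_value(array $body, array $langcodes = array('fr', 'und')) {
    foreach ($langcodes as $langcode) {
        if (isset($body[$langcode][0]['value'])) {
            return $body[$langcode][0]['value'];
        }
    }
    return '';
}

// Stand-in for $node->body on a node saved without a language.
$body_field = array(
    'und' => array(
        array('value' => '<p>Hello</p>'),
    ),
);

// Debugging aid: list the language keys that are actually present.
print_r(array_keys($body_field));

// Use the first available language instead of hard-coding 'fr'.
echo pick_body_value($body_field), "\n";
```

Drupal 7 also has field_get_items('node', $node, 'body'), which resolves the field language itself and may be the simpler choice inside a bootstrapped script.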

Thank you so much for writing this script. Even I, a web n00b, was able to use it and migrate away from the painful platform that Drupal has become.

Next on my list is to look for a self-hosted commenting system.
