@andrewshell
Created December 7, 2022 04:27
Debugging rssCloud on WordPress.com

I have a test application running at test.rsscloud.io on ports 80 and 9876.

The following code works:

curl --location --request POST 'https://brokenriverbooks.com/?rsscloud=notify' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'domain=test.rsscloud.io' \
--data-urlencode 'port=80' \
--data-urlencode 'path=/feedupdated-s8759' \
--data-urlencode 'registerProcedure=' \
--data-urlencode 'protocol=http-post' \
--data-urlencode 'url1=https://brokenriverbooks.com/feed/'

However, if you change the port to 9876, it fails. For some reason, the rssCloud plugin seems unable to reach ports other than 80.

@andysylvester

I think it is poor form to make a comment asking for help in a negative way and then delete it (as the email thread from this gist shows). Please stand behind your words. I would like to work this out, but when you do things like that, I get the feeling you want it all "your way". Since you are the developer of FeedLand, only you can see what is happening in the app beyond the Javascript console. I have supplied a lot of data showing my investigation of the problems with the WordPress rssCloud implementation, but I can only investigate things from "the outside" for the most part.

@andysylvester

@scripting -- you are correct, River5 also supports rssCloud. If you know of other aggregators that support rssCloud, I would be interested in testing them against this WordPress issue.

@scripting

Andy please I'm just trying to figure out what's going on. I deleted the question because I thought of a much better simpler way to ask it.

@andrewshell
Author

Steps I took:

  1. Have a new blog hosted on WordPress.com (ex https://andrewshell.wordpress.com/)
  2. Post a new blog post (ex https://andrewshell.wordpress.com/2022/12/09/test-post-1/)
  3. Add RSS feed of blog to FeedLand (By pressing "+ Feed" button, pasting URL, then OK)
    1. Example feed https://andrewshell.wordpress.com/feed/
  4. See the first blog post in FeedLand
  5. Add a new blog post (ex https://andrewshell.wordpress.com/2022/12/09/test-post-2/)

What I expect to see:

  1. Test post 2 should be listed in FeedLand shortly after posting it

What I saw:

  1. Test post 2 took about an hour to show up in FeedLand

@andysylvester

Andrew, that result is consistent with my debug session (https://andysylvester.com/2022/12/04/rss-cloud-support-in-wordpress-com-not-working/). I did some experimenting with River 5 as a debug tool this afternoon. When I ran it on ports 443 and 8080, I was able to see the RSS Cloud plugin site update within a few seconds (https://rsscloud.andysylvester.com/). However the WordPress.com site did not update (https://rsscloud4.wordpress.com/feed/). I will add some test code tomorrow to look at the WordPress.com rssCloud server response. My suspicion is that WordPress.com sites need to have the notification URL on port 80, but I will check this tomorrow.

@scripting

Andrew -- I'm looking into your report now.

One tool you have for seeing how rssCloud is working with this feed is the Feed Info page for the feed.

http://feedland.org/?feedurl=https%3A%2F%2Fandrewshell.wordpress.com%2Ffeed%2F

The Last renew number is the last time FeedLand successfully renewed the notification request with the feed's cloud server.

At this time the number is 20 hours.

Just a piece of data.

You can also get FL to check the feed immediately, the conventional way, by clicking on the Check Now link.

A screen shot.

A report on my experiment in the next comment.

@scripting

scripting commented Dec 10, 2022

  1. First, it's hard/impossible for me to step through this in the debugger because the local version of FeedLand that I use for testing is behind a firewall, and can't participate in rssCloud stuff. So I'm testing using the deployed server.

  2. I did find and fix an error. It was getting a 301 response, a permanent redirect. After the fix, the redirect is taking place.

  3. Another error shows up when renewing andrewshell.wordpress.com -- <notifyResult success='false' msg='No feed for url1.' />

going to look into this now

@scripting

I added some debugging code, and had it display the request to the console before it sends it off to wordpress.com.

theRequest == {
    "url": "http://andrewshell.wordpress.com:80/?rsscloud=notify",
    "followAllRedirects": true,
    "maxRedirects": 5,
    "headers": {
        "Accept": "application/json"
    },
    "method": "POST",
    "form": {
        "port": "1670",
        "path": "/feedupdated",
        "url1": "https://andrewshell.wordpress.com/feed/",
        "protocol": "http-post"
    }
}

I think my next step is going to be to assume that it's the "form" that's tripping up the server? Couldn't hurt to use the default of body instead of form. This came up in recent work with the Mastodon API. Back in a bit...

@scripting

BTW -- I see another problem. Every time feedland.js launches it might be assigned a different port.

This means that if it should crash and be relaunched, it might be assigned a different port when it is relaunched. Or when I install a new version as I am now doing every few minutes while I test this.

The upshot is that it will be sending notifications to the wrong port until the subscriptions are renewed within no more than 24 hours.

@scripting

Another BTW -- this feature went in a few days before the product was feature frozen before release, and there was no one in the test group who had anything remotely like the experience Andrew has with rssCloud, so I did some simple testing and let it go, figuring at some point people who cared about rssCloud would show up and we'd do exactly what we're doing now. 😄

@andrewshell
Author

Dave, we determined from our tests against the WordPress plugin that it only works if the aggregator is on port 80, 443, or 8080; otherwise, it considers it an unsafe URL. So sending port 1670 shouldn't work.
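
A quick way to catch this before sending a request is to check the callback port against the three ports mentioned above. This is only a sketch: the name isWordPressSafePort and the hard-coded list are assumptions for illustration, based on what our tests showed rather than on the plugin's source.

```javascript
// Ports our tests showed WordPress will call back on
// (assumption: the list is exactly these three).
const allowedPorts = [80, 443, 8080];

// Hypothetical helper: true if WordPress is likely to accept this callback port.
// Accepts a number or a numeric string, since form values arrive as strings.
function isWordPressSafePort (port) {
    const n = Number (port);
    return Number.isInteger (n) && allowedPorts.includes (n);
    }

console.log (isWordPressSafePort (80)); // true
console.log (isWordPressSafePort ("1670")); // false
```

An aggregator could run this before a pleaseNotify call and log a warning instead of waiting for the server's error message.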

@scripting

Here's the new request.

{
    "url": "http://andrewshell.wordpress.com:80/?rsscloud=notify",
    "method": "POST",
    "followAllRedirects": true,
    "maxRedirects": 5,
    "headers": {
        "Content-Type": "application/x-www-form-urlencoded"
    },
    "body": "port=1670&path=%2Ffeedupdated&url1=https%3A%2F%2Fandrewshell.wordpress.com%2Ffeed%2F&protocol=http-post"
}

@scripting

scripting commented Dec 10, 2022

And the wordpress server is responding, as before, with <notifyResult success='false' msg='No feed for url1.'>

At this point I don't see anything more I can do, it looks to me like the call is correct, and the error message seems to say it didn't find the URL in the request.
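
To make responses like this easier to log, here is a rough sketch of pulling the success flag and message out of the notifyResult element. The function name parseNotifyResult is made up for this example, and it uses regular expressions as a shortcut instead of a real XML parser, so treat it as an illustration only.

```javascript
// Hypothetical helper: extract the success flag and message from a
// notifyResult element. Regex-based, so it only handles simple,
// single-element responses like the ones seen in this thread.
function parseNotifyResult (xmltext) {
    const successMatch = /success=['"]([^'"]*)['"]/.exec (xmltext);
    const msgMatch = /msg=(['"])(.*?)\1/.exec (xmltext);
    return {
        success: successMatch !== null && successMatch [1] === "true",
        msg: (msgMatch === null) ? "" : msgMatch [2]
        };
    }

const result = parseNotifyResult ("<notifyResult success='false' msg='No feed for url1.' />");
console.log (result.success, result.msg); // false No feed for url1.
```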

@andrewshell
Author

I'll look to see if I can identify anything wrong with the request call. The one thing that I know won't work is using port 1670 because of the unsafe URL filter that had been discussed previously.

@scripting

@andrewshell -- aha. okay that explains what's going on.

@josephscott and I worked out this protocol quite a few years ago and he may still be tuned in.

It's easy to imagine another piece of software putting different constraints on what port FeedLand can run on.

The odd thing is that feedland.org is as far as anyone outside the server is concerned running on port 80. I wonder what would happen if I just forced the port to be 80. Do you think they're calling me back by IP address or by a reverse DNS call or..?

I love this bullshit, it's fun (no sarcasm). 😄

@scripting

Or force it to be 80 if the domain we're talking to is on wordpress.com! ;-)

Oh the humanity.

@scripting

Also if that is the problem they surely could provide a better error message. I thought from the message they returned that it was a syntax error of some kind.

@andrewshell
Author

Doing port 80 might work. I did notice that feedland doesn't seem to support the GET verification path (with challenge param) for pleaseNotify with the domain specified. (See http://walkthrough.rsscloud.co/#challengeParameter)

@scripting

@andrewshell -- okay that goes on my todo list. :-)

i do have another project that i'm working on that i must move forward today, and i want to think about my next steps here, because i don't want to hack something in here, it has to be maintainable and documented, otherwise it's sure to break in a few months when i forgot what i did here. feedland is a very diverse piece of software and docs and maintainability are huge priorities.

andrew thanks for your help with this. i really enjoy working with you.

@andrewshell
Author

@scripting No rush on this, I agree it needs to be done correctly. One thing I found while testing is that the error you're getting might be an issue with the request library, and might even be deliberate. I haven't dug into its source code yet, but even when followAllRedirects is true, the body of the POST doesn't appear to be present when the second call is made.

If you make the same call against https://andrewshell.wordpress.com/?rsscloud=notify instead (notice the https) it returns <notifyResult success='false' msg='Error testing notification URL : A valid URL was not provided.' /> which was because of the port number. I think having the form parameter is cleaner and wasn't the culprit.

The unfortunate solution is probably not to use followAllRedirects: true and watch for 301 and 302 status codes, and resend yourself with the location value in the header.

@andrewshell
Author

By using this wrapper instead of calling request directly it looks like the call does what you'd expect.

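const request = require ("request");

// Follow 301/302 redirects by hand so the POST body is resent on every hop.
function requestFollowRedirects(theRequest, callback) {
    theRequest.followAllRedirects = false;
    request (theRequest, function (err, response, body) {
        const redirectsLeft = parseInt(theRequest.maxRedirects) || 0;
        if (!err && redirectsLeft > 0 && [301, 302].includes(response.statusCode) && response.headers.location != null) {
            // Same request, new URL, one fewer redirect allowed.
            const newRequest = Object.assign({}, theRequest, { url: response.headers.location, maxRedirects: redirectsLeft - 1 });
            requestFollowRedirects (newRequest, callback);
            return; // the recursive call invokes the callback, so don't call it twice
            }
        callback (err, response, body);
        });
    }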

@andysylvester

Andrew, thanks for sharing your work! I appreciate it. I am still planning to do some testing today, but it is apparent to me from your investigation that processing of rssCloud feeds from WordPress.com and WordPress.org sites is going to require some additional work beyond what is currently in FeedLand, and probably also in River5.

@scripting

@andrewshell -- I've made the change in how FeedLand requests notification, it does its own redirection as illustrated by your example.

I am about to deploy it on feedland.org.

but i don't have enough information about the wordpress server to know what domain name it's using to address my server, for all i know they're using the ip address. if they use anything other than feedland.org to send the notification, port 80 will not work.

i might be willing to do a little customization just for wordpress.com servers, but not unless i know what i'm doing. i'm not willing to even experiment with this on a live server. so we're kind of at an impasse. but at least it should work with your rsscloud server, which i assume does not care what port i'm running on, so there's that. ;-)

@andrewshell
Author

The WordPress server follows the rssCloud spec. So since you don't specify the domain in your call, it does a POST to your IP address with the specified port. If you want it to go to feedland.org:80, you'll need to specify the domain name and handle the appropriate validate endpoint where it's a GET request with url and challenge parameters. See http://walkthrough.rsscloud.co/#challengeParameter
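
As a rough sketch of that validate endpoint: when the GET arrives with a challenge parameter, the aggregator answers with the challenge value as the response body. The function name handleCloudGet and the shape of its return value are hypothetical, and treating a GET without a challenge as an ordinary ping is my assumption, not something taken from the walkthrough.

```javascript
// Hypothetical handler for a GET sent to the aggregator's callback path.
// query is an object holding the already-parsed query-string parameters.
function handleCloudGet (query) {
    if (typeof query.challenge === "string" && query.challenge.length > 0) {
        // Verification request: echo the challenge back as the body.
        return {status: 200, contentType: "text/plain", body: query.challenge};
        }
    // No challenge: assumed here to be an ordinary ping for query.url.
    return {status: 200, contentType: "text/plain", body: "Thanks for the update."};
    }

// Example: parse the query string of an incoming verification request.
const incoming = new URL ("http://feedland.org/feedupdated?url=https%3A%2F%2Fandrewshell.wordpress.com%2Ffeed%2F&challenge=abc123");
const params = Object.fromEntries (incoming.searchParams);
console.log (handleCloudGet (params).body); // abc123
```

Keeping the handler free of any HTTP framework makes it easy to wire into whatever server is already listening on port 80.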

@scripting

@andrewshell -- thank you. that's the info i was looking for. it's as if i didn't write those words myself. 😄

@josephscott

For reference, the WordPress safe request calls being limited to 80, 443, and 8080 happen inside WordPress core at https://core.trac.wordpress.org/browser/trunk/src/wp-includes/http.php#L589 - so this applies broadly across WordPress installs, including WordPress.com.

@scripting

I have a new version of feedland.org running, it makes a pleaseNotify request using this request object.

{
    "url": "http://andrewshell.wordpress.com:80/?rsscloud=notify",
    "method": "POST",
    "followAllRedirects": true,
    "maxRedirects": 5,
    "headers": {
        "Content-Type": "application/x-www-form-urlencoded"
    },
    "body": "domain=feedland.org&port=80&path=%2Ffeedupdated&url1=https%3A%2F%2Fandrewshell.wordpress.com%2Ffeed%2F&protocol=http-post"
}
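
For anyone who wants to reproduce that body string, here is a minimal sketch using URLSearchParams, which is built into Node. The function name buildPleaseNotifyBody and its options object are invented for this example and are not FeedLand's actual code.

```javascript
// Hypothetical helper: build the form-encoded body of a pleaseNotify request.
function buildPleaseNotifyBody (options) {
    const params = new URLSearchParams ();
    if (options.domain !== undefined) {
        // Leave domain out and the server calls back by IP address instead.
        params.append ("domain", options.domain);
        }
    params.append ("port", String (options.port));
    params.append ("path", options.path);
    params.append ("url1", options.feedUrl);
    params.append ("protocol", options.protocol);
    return params.toString ();
    }

console.log (buildPleaseNotifyBody ({
    domain: "feedland.org",
    port: 80,
    path: "/feedupdated",
    feedUrl: "https://andrewshell.wordpress.com/feed/",
    protocol: "http-post"
    }));
```

With these inputs the output should match the body in the request object above, and dropping the domain option gives the shape of the earlier request that did not specify one.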

@scripting

The upshot of this, as far as I know -- we should now be cool with wordpress.com's servers.

The way to test it, I guess is with a wordpress.com hosted site, try renewing its subscription and see what happens?

Is this correct?

@scripting

So I subscribed to this feed.

http://feedland.org/?feedurl=https%3A%2F%2Funberkeley.wordpress.com%2Ffeed%2F

Now I'm going to add a test post and see what happens.

Note I couldn't follow this in the debugger so I don't know if we correctly handled the challenge parameter.

@scripting

Houston we have liftoff.

[Image]

@andrewshell
Author

I was able to test with my WordPress blog and can concur that it appears to be working now.

[Image: Screen Shot 2022-12-12 at 2:02:23 PM]

@andysylvester

Andrew, I just did a test with my WordPress.com blog (https://rsscloud4.wordpress.com/feed/), and it took 60 minutes for the post to appear on FeedLand. Could you perform another test with your feed (https://andrewshell.wordpress.com/feed/)? I will do some additional tests with my WordPress.com blog and my WordPress.org blog with the RSS Cloud plugin. Based on my test, it seems that FeedLand is still having problems with RSS Cloud for WordPress.com blogs.

@andrewshell
Author

@andysylvester I tried it again, and it's still working. It's possible you ran your test before FeedLand resubscribed. I'd test it again if I were you.

@scripting

BTW you can see how long it's been since the last cloud renew by going to the Feed Info page for the feed.

http://feedland.org/?feedurl=https%3A%2F%2Frsscloud4.wordpress.com%2Ffeed%2F

In this case it was successfully renewed 13 hours ago.

@josephscott

It sounds like things are working now, at least with WordPress.com feeds.

The RSS Cloud plugin itself does still need to be updated. I'll get an update for that out this week.

@josephscott

I've released an updated version of the RSSCloud plugin ( 0.5.0 ) - https://wordpress.org/plugins/rsscloud/ - that includes the PHP 8+ fixes and the default scheme of HTTP when none is provided. Let me know if anything else comes up.

@andysylvester

@scripting, thanks for the pointer on the cloud renew time, that is helpful.

I did two test posts from my WordPress.com site just now (https://rsscloud4.wordpress.com/feed/), both posts appeared in FeedLand within several seconds, so FeedLand support of rssCloud is working for me now. Thanks @scripting and @andrewshell for your help!

@andysylvester

@josephscott - I will go ahead and download the new plugin and give it a test - thanks!

@andysylvester

@josephscott - I updated to version 0.5 on my plugin test site (https://rsscloud.andysylvester.com/), made two posts, they have not appeared on FeedLand yet. Is there anything you would like me to check? The FeedLand sub to this site was renewed 2 hrs ago, should I wait for the next renewal to try again?

@josephscott

First thing I'd check is to review the error logs for the site, just to make sure nothing went sideways.

Next, is there anything on the site that would interfere with the WordPress cron feature? And does the site get plenty of page views, to give an opportunity for the cron feature to run.

It might also be worthwhile checking on the cron jobs inside the WordPress install, a plugin like https://wordpress.org/plugins/wp-crontrol/ should be sufficient for that.

@andysylvester

@josephscott I will look at those items, thanks for the suggestions!

@scripting

scripting commented Dec 17, 2022

Good morning! I decided that now that I have a working implementation again and it's fresh in my mind, I should write an example app in JavaScript that implements the aggregator side of rssCloud.

I believe I have it working for notifications via POST, but have hit a snag with the notifications via GET.

My code assumes there will be a challenge parameter, and that it will not be used for a "real" notification of an update, that it will only be used as part of the notification request. But it seems it is being used for notification, and when that happens there is no challenge parameter (which would make sense).

The example I'm testing with is unberkeley.wordpress.com.

I'm going to assume that the description above is what's going on, and be prepared for a call without the challenge parameter.

If you want to see the actual code it's in this file.

https://github.com/scripting/reallysimple/blob/main/demos/clouddemo/clouddemo.js

BTW, in reviewing the walkthrough -- I desperately want to rewrite it. It has become very dated, and the links to examples are broken. However I'm not sure I will have the time for that, all the more reason it's important to have a good working example out there.

@scripting

Update: As I'm debugging it further -- the challenge parameter does appear to be there even for real pings, not test pings. Still looking into it.

@scripting

It only took me an hour to find a missing break; statement.

I've been programming for 800 years and I still make beginner's mistakes.

Never mind. ;-)

@andysylvester

I couldn't leave a comment on the reallySimple repo for clouddemo.js (it appears to be locked), so I am leaving a comment here with my report. I was able to run the app on port 443 and successfully register a WordPress.com site and a WordPress.org site using version 0.5.0 of the WordPress RSS Cloud plugin. When I ran my test script on the WordPress.org site, it worked, but I figured out that the WordPress.org site is caching the RSS feed, which explains why feed updates were not seen by FeedLand.

@josephscott - I have looked at this Stack Exchange post for ideas on disabling caching of WordPress feeds, if you have any better ideas, I am interested.

The code for clouddemo.js looks good, I am pleased that it could run on port 443, since I already had an app running on port 80 on my server and did not want to change it.

@josephscott

@andysylvester are you referring to the RSS blog/widget caching the results of a feed? That would be the other end of this, you would need something to trigger an invalidation of the cached feed.

@andysylvester

@josephscott I am not sure what you mean by "RSS blog/widget". My understanding is that WordPress uses SimplePie to generate the RSS feed of a WordPress website/blog. It appears to be common knowledge that WordPress itself caches the RSS feed of the site (see https://amandagiles.com/blog/code-snippets/updating-wordpress-rss-feed-cache-time/). To me, this is a problem for using rssCloud with WordPress, since most feed readers will not display new updates from the site if the feed it retrieves has not changed, even if it receives a response from an rssCloud server that the site has updated. In my opinion, for rssCloud use, the caching should be turned off or reduced to 5-10 seconds.

Here is my anecdotal evidence supporting this: I made posts on my WordPress.org site using version 0.5.0 of the RSS Cloud plugin. I made 3 posts and received three responses from the plugin (see this gist). However, the posts did not appear in FeedLand. I checked the URL of the feed in my browser (https://rsscloud.andysylvester.com/feed/), and it was updated. I then did a curl command of the feed, and it had not updated. I then went to the Dashboard for my site, went to Settings/Reading, and changed the value of "Syndication feeds show the most recent x items" from 10 to 5. After doing that, the curl command returned the feed I was seeing in my browser, and eventually the posts showed up in FeedLand.

Following up on your last sentence, I think that the RSS Cloud plugin should trigger an invalidation of the cached feed so that feed readers will get the most recent version of the feed. Otherwise, the RSS Cloud plugin notification does not provide any benefit.

@josephscott

Sorry, my previous comment was supposed to be "RSS block/widget".

In terms of the RSS feed updating, I've run a few tests on a stock out of the box WordPress site and didn't see any problems with the RSS feed updating right away.

What you described with the feed being updated when you viewed it with the browser but not updated when done with curl sounds like it could be a caching difference based on cookies. Were you logged into the site with the browser you used to check the feed?

I went back to the fresh stock WordPress install for testing and still didn't have a problem with the feed updating when viewed logged out or logged in, so I don't think this is sufficient to explain what you saw either.

Are there any other plugins on the site you are using to test? Can you try it with a fresh WordPress install with no plugins and check to see if the RSS feed updates as expected?

@scripting

@josephscott @andysylvester --

Once I got the FeedLand implementation working with WordPress and @andrewshell's server, I created a very simple demo app that just does the rssCloud functionality in FeedLand, so anyone can test with our implementation and see exactly what's going on. MIT License.

https://github.com/scripting/reallysimple/tree/main/demos/clouddemo

It also provides a reference implementation for people just getting started.

At some point I need to rewrite the walkthrough. It's not very well done imho (I am the author).

@andysylvester

@josephscott - I was logged into the site, so I will check on that. I really appreciate your feedback, and will work on testing today.

@andysylvester

@scripting - I was already aware of your implementation (see this link to comment above), so I assume you are really leaving your comment for the benefit of @josephscott. I used your app to help figure out my current issue with the RSS Cloud plugin. Since it appears you have removed your block of my account on the reallySimple repo, I will post further feedback on your reference app on that repo in the future.

@andysylvester

Well, I see it didn't take long to restore the block of my account...

@andysylvester

@josephscott - Yesterday, per your suggestion, I created a new WordPress site (https://scott2.andysylvester.com/) and disabled all plugins, then installed version 0.5.0 of the RSS Cloud plugin, and was able to successfully see posts appear within several seconds on FeedLand, which would indicate that the plugin is working. I also checked the feed in the browser and using curl, and both contained the correct content. This morning, I decided to do additional tests using my test app and Dave Winer's test app. Both apps were able to receive the notification from new posts. However, posts did not appear in FeedLand. I also checked the feed in the browser and using curl, and both contained the correct content.

When I looked at the output of my app, which uses the readFeed function within the NPM reallySimple package to get the feed, the first title in the feed it read was for test post 8, although I had created a new test post. This behavior continued even though I made several additional posts. In the posts, I captured info on each state of testing. To me, if FeedLand was not seeing any changes in the feed, I can totally understand that it would not display anything from the feed. As I have commented before, this seems like a caching problem, but I am not sure where to look.

I do not fully understand what is happening here, but this behavior has now occurred on two different WordPress sites using the latest RSS Cloud plugin. I would like to work this out, because WordPress is the major creator of rssCloud-enabled feeds, and I want to use rssCloud and WordPress in a new project. @josephscott, I am willing to keep posting to this thread, or in some other forum to work this out, I appreciate the feedback you have given me, and want to run this problem down if I can.

@josephscott

From what you are describing, I'm not sure that this is directly related to the RSS Cloud plugin. You mentioned that it worked out of the box, with the feed showing the latest data and pings going out to FeedLand.

To track this down I think you are going to need more data. I'd recommend adding logging to every step, so that you can confirm when a step you were expecting to happen went missing.

@andysylvester

@josephscott - thanks for getting back with me. I did see it work "out of the box", but then saw this caching issue when I was running some supplemental test scripts. I will add some logging and report back on what I find.

@andysylvester

@josephscott - I apologize for taking so long to get back to you on this thread. I found that my hosting provider (Bluehost) had a site-by-site caching setting that I was not aware of. I turned caching off, and I also uncommented line 12 in rsscloud.php within the WordPress RSS Cloud plugin. After taking those two actions, I was able to see WordPress posts within a few seconds on FeedLand and a separate tool I created. More info is available here: https://andysylvester.com/2023/02/23/solved-my-problem-with-wordpress-caching/. Thanks for your helpful suggestions!
