
@cisene
Created October 13, 2023 13:06
Description of stale feed refresh logic

Stale feed refresh

Feeds that have had no activity for 18 months are deemed stale (static) and are flagged with a negative priority.

To refresh these every once in a while, something like this could be set up.

API Endpoints

The regular authentication applies to all endpoints. GET endpoints need READ privileges; PATCH endpoints alter data, so they would require WRITE privileges on the API key.

Status Description
200 OK, with response body.
204 OK, without response body.
400 FAIL, request was not understood.
401 Unauthorized.

GET /podcasts/stale

GET /podcasts/stale

This call returns all feeds that have been marked stale (priority == -1).

{
  "status": "true",
  "feeds": [
    {
      "id": 75075,
      "podcastGuid": "9b024349-ccf0-5f69-a609-6b82873eab3c",
      "title": "Batman University",
      "url": "https://feeds.theincomparable.com/batmanuniversity",
      "duplicateOf": 75075,
      "lastUpdateTime": 1613394044,
      "lastCrawlTime": 1613394034,
      "lastParseTime": 1613394045,
      "lastGoodHttpStatusTime": 1613394034,
      "lastHttpStatus": 200,
      "contentType": "application/rss+xml"
    }
  ],
  "count": 1,
  "description": "Found matching feed"
}
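As a rough illustration of consuming this response, the sketch below parses a body shaped like the sample above and returns its feed objects. The function name `stale_feeds` and the `count` sanity check are assumptions made for this example, not part of the proposed API.

```python
import json
from typing import Any


def stale_feeds(response_body: str) -> list[dict[str, Any]]:
    """Return the feed objects from a /podcasts/stale response body."""
    payload = json.loads(response_body)
    # The sample response encodes status as the string "true", not a boolean.
    if payload.get("status") != "true":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    feeds = payload.get("feeds", [])
    # Assumption: "count" is expected to equal the number of returned feeds.
    if "count" in payload and payload["count"] != len(feeds):
        raise ValueError("count does not match the number of feeds")
    return feeds


# Trimmed copy of the sample response shown above.
sample = json.dumps({
    "status": "true",
    "feeds": [{
        "id": 75075,
        "podcastGuid": "9b024349-ccf0-5f69-a609-6b82873eab3c",
        "url": "https://feeds.theincomparable.com/batmanuniversity",
        "lastUpdateTime": 1613394044,
    }],
    "count": 1,
    "description": "Found matching feed",
})

for feed in stale_feeds(sample):
    print(feed["podcastGuid"], feed["url"])
```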

PATCH /podcasts/priority

Request body

{
  "feeds": [
    {
      "podcastGuid": "9b024349-ccf0-5f69-a609-6b82873eab3c",
      "priority": 1
    }
  ]
}

Responses are 204, 400 and 401 for success, failure and unauthorized, respectively.

PATCH /podcasts/dead

Request body

{
  "feeds": [
    {
      "podcastGuid": "9b024349-ccf0-5f69-a609-6b82873eab3c",
      "dead": 1
    }
  ]
}

Responses are 204, 400 and 401 for success, failure and unauthorized, respectively.
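Both PATCH endpoints take the same body shape and differ only in the flag they set, so one helper could build either payload. This is a minimal sketch; `build_patch_body` and `outcome` are hypothetical names, and the outcome labels simply mirror the status table above.

```python
import json


def build_patch_body(guids, field, value=1):
    """Build a request body for PATCH /podcasts/priority or PATCH /podcasts/dead."""
    if field not in ("priority", "dead"):
        raise ValueError(f"unsupported field: {field!r}")
    return {"feeds": [{"podcastGuid": guid, field: value} for guid in guids]}


def outcome(status_code):
    """Map a PATCH response status to the outcomes listed in this document."""
    outcomes = {204: "success", 400: "fail", 401: "unauthorized"}
    return outcomes.get(status_code, "unexpected")


body = build_patch_body(["9b024349-ccf0-5f69-a609-6b82873eab3c"], "dead")
print(json.dumps(body, indent=2))
print(outcome(204))
```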

Logic flow

  1. A client fetches the list of stale feeds to validate/verify through the API
  2. Iterate over the returned feed objects
    1. Request HEAD on the feed URL
    2. Request GET on the feed URL
    3. Validate/verify the contents of the RSS
  3. Send the resulting dead/priority lists back to the API
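The steps above can be sketched as a loop that sorts each feed into a `dead` or `priority` list. Everything named here (`sort_feeds`, `http_fetch`, and the `fetch`/`validate` callables) is hypothetical; the HTTP helper uses only the standard library and, by default, follows redirects before reporting a status.

```python
import urllib.error
import urllib.request


def http_fetch(method, url, timeout=10.0):
    """Return (status, body) for a request; status 0 means the request failed."""
    try:
        request = urllib.request.Request(url, method=method)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # Non-2xx responses surface as HTTPError; keep only the status code.
        return error.code, b""
    except (urllib.error.URLError, OSError, ValueError):
        return 0, b""


def sort_feeds(feeds, fetch, validate):
    """Split feeds into (dead, priority) lists of podcastGuid values."""
    dead, priority = [], []
    for feed in feeds:
        guid, url = feed["podcastGuid"], feed["url"]
        # Step 2.1: HEAD first, to avoid downloading feeds that are gone.
        head_status, _ = fetch("HEAD", url)
        if head_status != 200:
            dead.append(guid)
            continue
        # Step 2.2: GET the feed body.
        get_status, body = fetch("GET", url)
        # Step 2.3: validate the RSS; any failure marks the feed as dead.
        if get_status != 200 or not validate(body, feed):
            dead.append(guid)
            continue
        priority.append(guid)
    return dead, priority
```

Passing `fetch` and `validate` in as arguments keeps the loop testable without network access: `sort_feeds(feeds, http_fetch, my_validator)` would run it for real, where `my_validator` stands for whatever RSS check the client implements.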
graph TB
	%% Fetch /podcasts/stale
	stale((GET /podcasts/stale)) --> list(List of objects)
	
	dead(`dead` ObjectList)
	priority(`priority` ObjectList)

	dead --> PATCHDEAD((PATCH /podcasts/dead))
	priority --> PATCHPRIORITY((PATCH /podcasts/priority))

	subgraph Object Loop
		%% HEAD Feed URL
		list --> HEAD(HEAD feed url)
		HEAD -.-> HEADOK(HEAD Status == 200)
		HEAD -.-> HEADFAIL(HEAD Status != 200)

		HEADOK --> GET
		HEADFAIL -. Update object with dead status .-> dead

		%% GET Feed URL
		GET(GET feed url)
		GET -.-> GETOK(GET Status == 200)
		GET -.-> GETFAIL(GET Status != 200)
		
		GETFAIL -. Update object with dead status .-> dead
		GETOK --> AnalyzeRSS

		subgraph Analyze RSS
            %% Analyse RSS feed
            AnalyzeRSS((Analyse/inspect RSS)) -. Yes .-> AnalyzeRSSRoot(Element `<rss>` exists?)
            AnalyzeRSS((Analyse/inspect RSS)) -. No .-> dead
            
            AnalyzeRSSRoot -. Yes .-> AnalyzeRSSChannel(Element `<channel>` exists?)
            AnalyzeRSSRoot -. No .-> dead

			AnalyzeRSSChannel -. Yes .-> AnalyzeRSSItem(Element `<item>` exists?)
			AnalyzeRSSChannel -. No .-> dead

			AnalyzeRSSItem -. Yes .-> AnalyzeRSSItemCount(Count of `<item>` is >= 1?)
			AnalyzeRSSItem -. No .-> dead
			
            AnalyzeRSSItemCount -. Yes .-> AnalyzeRSSItemEnclosure(`<item>` has element `<enclosure>`?)
            AnalyzeRSSItemCount -. No .-> dead

            AnalyzeRSSItemEnclosure -. Yes .-> AnalyzeRSSItemEnclosureType(`<enclosure>` has attribute `type`?)
            AnalyzeRSSItemEnclosure -. No .-> dead
            
            AnalyzeRSSItemEnclosureType -. Yes .-> AnalyzeRSSItemEnclosureTypeMimeTypes(`type` is `audio/*` or `video/*`)
            AnalyzeRSSItemEnclosureType -. No .-> dead
            
            AnalyzeRSSItemEnclosureTypeMimeTypes -. Yes .-> AnalyzeRSSItemPubdate(`<item>` element `<pubDate>` later than `lastUpdateTime`?)
            AnalyzeRSSItemEnclosureTypeMimeTypes -. No .-> dead
            
            AnalyzeRSSItemPubdate -. Yes .-> priority
            AnalyzeRSSItemPubdate -. No .-> dead
		end
	end
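
The "Analyze RSS" chain in the diagram could be implemented roughly as below. This is a sketch under stated assumptions: the function name `feed_is_alive` is hypothetical, and it treats a feed as alive if any single `<item>` passes all the per-item checks, since the diagram does not say whether one item or every item must qualify.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def feed_is_alive(body: bytes, last_update_time: int) -> bool:
    """Return True if the feed should move to `priority`, False for `dead`."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False  # not parseable as XML
    if root.tag != "rss":
        return False  # no <rss> root element
    channel = root.find("channel")
    if channel is None:
        return False  # no <channel>
    items = channel.findall("item")
    if not items:
        return False  # no <item> elements
    for item in items:
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue
        # A missing `type` attribute becomes "" and fails the prefix check.
        mime_type = enclosure.get("type", "")
        if not mime_type.startswith(("audio/", "video/")):
            continue
        pub_date = item.findtext("pubDate")
        if not pub_date:
            continue
        try:
            published = parsedate_to_datetime(pub_date.strip())
        except (TypeError, ValueError):
            continue  # unparseable date
        if published.timestamp() > last_update_time:
            return True  # Assumption: one qualifying item is enough.
    return False


rss = b"""<rss version="2.0"><channel><title>Example</title>
<item><pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
<enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="1"/>
</item></channel></rss>"""

print(feed_is_alive(rss, 1613394044))
```

The `pubDate` is parsed as an RFC 2822 style date, which is what RSS 2.0 feeds normally carry; feeds with other date formats would fall through to `dead` in this sketch.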
