@drdaxxy
Last active January 28, 2022 20:40

Grabbing TechnologyGuide forums via Tapatalk

Most of this applies to any forum using a (sufficiently up-to-date) version of the Tapatalk plugin. Some things may be specific to XenForo 1 or TechnologyGuide's setup. Users and forum administrators may also want to export private data (PMs, restricted forums, edit history...), which is out of scope here.

  • Error responses: 200 OK with body like:
result = /* response body */ {
  result: false,
  result_text: "Need valid forum id!",
  error: "Need valid forum id!"
}
  • WAF-side errors: 403 Forbidden with an HTML body, recognizable by the marker <title>Request Rejected</title>
  • Should we need authentication for anything, the xf_session cookie from the HTML frontend is sufficient IIRC.
  • No pagination limits on the app side; the server probably times out on pages that are too big.
  • A lot of scalar properties appear in multiple views of one object. Only crawl-relevant properties are listed below.
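
Since API errors come back as 200 OK, a crawler has to inspect the body rather than the status code. A minimal sketch of that check, assuming the body is plain JSON (the notation in these notes is JS-like shorthand, so verify this against a real response); `classify_response` and `WAF_MARKER` are made-up names for illustration:

```python
import json

# Marker from the WAF's "Request Rejected" HTML page, as noted above.
WAF_MARKER = "<title>Request Rejected</title>"


def classify_response(status: int, body: str) -> tuple[str, object]:
    """Return (kind, payload); kind is 'ok', 'api_error' or 'waf_error'."""
    # WAF rejections are HTML, so check for them before trying to parse JSON.
    if status == 403 and WAF_MARKER in body:
        return ("waf_error", None)
    # Assumption: the body is JSON. json.loads raises ValueError otherwise.
    data = json.loads(body)
    # API-level errors arrive with HTTP 200 and result == false.
    if isinstance(data, dict) and data.get("result") is False:
        return ("api_error", data.get("result_text") or data.get("error"))
    return ("ok", data)
```

Anything that is neither a WAF page nor valid JSON raises, which is probably what you want in a crawl: stop and look at it.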

Get all forums

GET /mobiquo/tapatalk.php?method_name=get_forum
forums = /* response body */ [
  {
    forum_id: "1007",
    url: "",          // empty string, unless this is an external link
    child: []         // subforums, same schema as root level forums array, recursive
  }
]
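
The forum tree is recursive, so flattening it first makes the later per-forum loops simpler. A rough sketch, using only the fields shown above; skipping entries with a non-empty `url` is my assumption (external links presumably have no threads to fetch), and both function names are hypothetical:

```python
from typing import Iterator


def iter_forums(forums: list[dict]) -> Iterator[dict]:
    """Yield every forum in the tree, parents before their children."""
    for forum in forums:
        yield forum
        # "child" has the same schema as the root array, so recurse into it.
        yield from iter_forums(forum.get("child", []))


def crawlable_forum_ids(forums: list[dict]) -> list[str]:
    """Forum IDs worth crawling (assumption: external-link forums are skipped)."""
    return [f["forum_id"] for f in iter_forums(forums) if not f.get("url")]
```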

Then for each forum in forums and recursively forum.child (modes are separately paginated):

Get all threads in forum

Get normal threads, stickies, announcements, respectively. Paginated separately, same schema.

GET /mobiquo/tapatalk.php?method_name=get_topic&forumId=<forum_id>&page=<page>&perPage=<per_page>
GET /mobiquo/tapatalk.php?method_name=get_topic&forumId=<forum_id>&page=<page>&perPage=<per_page>&mode=TOP // stickies
GET /mobiquo/tapatalk.php?method_name=get_topic&forumId=<forum_id>&page=<page>&perPage=<per_page>&mode=ANN // announcements
forum = /* response body */ {
  total_topic_num: 597,
  topics: [
    {
      total_post_num: 1234,
      topic_id: "256",
      topic_author_id: "4828",
      post_author_id: "4828", first_post_author_id: "4828", last_reply_author_id: "2883",
      icon_url: "http://...", first_post_icon_url: "http://...", last_reply_icon_url: "http://..."  // OP and last poster avatars
    }
  ]
}

Then for each topic in each page (1 to ceil(total_topic_num / per_page)):

Get all posts in thread

TODO: Check what exactly returnHtml transforms. Maybe returnHtml=1 yields higher fidelity?

GET /mobiquo/tapatalk.php?method_name=get_thread&topicId=<topic_id>&returnHtml=0&page=<page>&perPage=<per_page>
thread = /* response body */ {
  total_post_num: 1234,
  ...op_and_last_reply_properties,
  posts: [
    {
      post_id: "9001",
      post_author_id: "4828", icon_url: "http://...",     // poster ID and avatar
      editor_id: "4828",                                  // only present if edited
      inlineattachments: [                                // attachments included in the post with an [IMG] tag
        {
          attachment_id: "5555",
          filesize: 829385,                               // may want to extract these alongside links, to plan out attachment crawl
          thumbnail_url: "http://...", url: "http://..."  // only thumbnails are guest-viewable
        }
      ],
      attachments: [],                                    // same object schema as inlineattachments
      likes_info: [{ user_id: "8283" }],
      thanks_info: [],                                    // probably same as likes? I'm not sure this forum, or XF1 Tapatalk in general, uses this
      post_content: "X<br /><b>[url=http://x]Y[/url]</b>" // mixed HTML and BBCode, maybe parse links and tags (quote, img at least) to find more resources?
    }
  ]
}
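
One way to collect further resources from a post is to gather both attachment lists and scan post_content for BBCode links and images. A rough sketch, not a full BBCode parser: it only handles the unquoted `[url=...]` and `[img]...[/img]` forms, which is an assumption about what the plugin emits, and `post_resources` is a hypothetical helper:

```python
import re

# Assumption: links appear as [url=TARGET]...[/url] without quotes around TARGET.
URL_TAG = re.compile(r"\[url=([^\]]+)\]", re.IGNORECASE)
IMG_TAG = re.compile(r"\[img\](.*?)\[/img\]", re.IGNORECASE | re.DOTALL)


def post_resources(post: dict) -> dict:
    """Collect attachment entries and BBCode link/image targets from one post."""
    # Both lists share a schema; keep filesize to help plan the attachment crawl.
    files = [
        (a["attachment_id"], a["url"], a.get("filesize"))
        for key in ("inlineattachments", "attachments")
        for a in post.get(key, [])
    ]
    content = post.get("post_content", "")
    return {
        "attachments": files,
        "links": URL_TAG.findall(content),
        "images": [u.strip() for u in IMG_TAG.findall(content)],
    }
```

Quote tags and bare `[url]...[/url]` links are left out here and would need their own patterns.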

Extra

Users

Attachments

  • Thumbnails openly available, originals require login
  • Nothing to do with the Tapatalk API from that point

Problems / missing

  • Assets not covered here (emotes?)
  • BBCode is not original post source, but adapted for app display
    • Likely not fully reversible, analyze plugin source for details
  • What about profile posts? (NBR has them, not sure about others, haven't checked whether API provides them yet)
  • Media gallery is a third-party addon, not covered by API

Revisions

  • v2: examined returnHtml parameter