This thread will soon become the longest one in the whole mxs sdk section
page_crawl_posts_step = 15
-- for each threadData in the CSV do
-- sleep for a reasonable amount of time between requests, so as not to disturb the cgs servers with lots of requests
url = @"https://forums.cgsociety.org/t/msx-editor-access/2049420"
thread_id = (tmp = filterString url "/"; tmp[tmp.count]) -- last URL token is the topic id
savepath = @"C:\somefolder" + "/" + thread_id
postcount = 10
if not doesFileExist savepath do makeDir savepath
if postcount <= 20 then
	-- short threads fit in a single request
	dragAndDrop.DownloadUrlToDisk url (savepath + "/" + thread_id + ".html") 0
else
	-- longer threads: step through the posts, ~20 per page with some overlap
	for i = 1 to postcount by page_crawl_posts_step do
		dragAndDrop.DownloadUrlToDisk (url + "/" + (i as string)) (savepath + "/" + thread_id + "-" + (i as string) + ".html") 0
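Side note: `postcount` is hard-coded above. CGSociety's forum appears to run Discourse, and if so (this is an assumption, not something I've confirmed in their docs), the topic JSON at `url + ".json"` exposes a `posts_count` field, so the count could be fetched instead. A crude sketch using .NET's WebClient from MAXScript, scanning the string rather than parsing JSON properly:

```maxscript
-- sketch only: assumes the forum serves Discourse-style topic JSON at url + ".json"
fn getPostCount url =
(
	local wc = dotNetObject "System.Net.WebClient"
	local json = wc.DownloadString (url + ".json")
	wc.Dispose()
	-- crude extraction of "posts_count":N -- a real JSON parser would be safer
	local pos = findString json "\"posts_count\":"
	if pos == undefined then undefined
	else (filterString (substring json (pos + 14) 10) ",}")[1] as integer
)
```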
The result is just raw data, though, not viewable in a browser. It also seems you can't get more than 20 posts per request.
I'd also prefer to have the entire thread in a single file, but that is much more complicated, since it would require either combining the saved files into one programmatically or driving a headless browser to scroll each thread from top to bottom before saving it to disk.
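For the combine-programmatically route, here is a minimal sketch that just appends the per-page files saved by the script above into one file. Naive concatenation like this won't render cleanly in a browser (duplicated headers, scripts, etc.), but it is enough for offline grep-style searching:

```maxscript
-- naive sketch: append every saved page of a thread into one file;
-- assumes files named <thread_id>-<i>.html as produced by the loop above
fn combineThreadPages savepath thread_id postcount step =
(
	local combined = createFile (savepath + "/" + thread_id + "-combined.html")
	for i = 1 to postcount by step do
	(
		local pageFile = savepath + "/" + thread_id + "-" + (i as string) + ".html"
		if doesFileExist pageFile do
		(
			local f = openFile pageFile
			while not eof f do format "%\n" (readLine f) to:combined
			close f
		)
	)
	close combined
)
```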
Saving the content for personal use shouldn't be forbidden, I guess. Why else would search-engine web crawlers be allowed to do it?