|
Post by Killix on Oct 11, 2021 21:18:10 GMT -5
Those authors can still be credited without using full names.
If someone is credited as Joe John Smith in the NT, the archive can credit them as "Joe Smith" or just "Joe". No identifiable information is shared, especially if it's just a first name with no links.
|
|
|
Post by Duke Pikachu on Oct 11, 2021 22:47:04 GMT -5
Ooh yeah, that does sound like a concern. I'm not familiar with how NT submissions worked back in the day, but I've seen what you're talking about, at least in some of the oldest entries' credits. I'm having trouble thinking of a workaround right now for how to still credit entries without disclosing personal information that was originally given as a child, but hopefully there's a good solution I'm not seeing. For the moment, for entries using a real name (or what looks to be a real name), it's probably best to shorten it to initials (so "John Doe" would just become "J. D."). At that point it would be up to the original author to contact us if they want to be credited properly, and luckily it's an easy test to know whether they're telling the truth. At worst they used a pseudonym, forgot what the pseudonym was, and would then have to do some investigation work (though I think the likely result there is that we can't confirm who wrote the entry, so it's left as is).
|
|
|
Post by RielCZ on Oct 12, 2021 21:01:45 GMT -5
Ooh, yeah this is really cool! And a much better way to gather up all the NT content without taking countless man-hours xD I think this script is great for gathering all the basic content, and then from there we can manually match titles and authors to the appropriate files and do whatever else we may need/want for organization. I love it \o/ [...]

People might want those two products to stay attached, but that might also come down to manual saving unless we did end up saving all thumbnails (if I'm understanding that correctly). If we want, I'd be happy to keep collecting thumbnails while issues are saved, since that's not too intensive a process on its own. Ah yeah, pre-150 issues are, uh, rather special in setup, I've discovered xD [...]

What impacted me most is that I found out a majority of comics aren't singular images, but rather are spliced into 2-3 (or more) images. Sometimes they're divided by panels, sometimes panels are cut in half between the images. It's led me to either screenshot the comic with the snipping tool or save the multiple images if the comic requires scrolling to read. That might mean a manual effort of splicing comics back together after they're saved, but that'd thankfully be doable, I feel. [...]

Yay, thank you! Glad my script seems to be helpful. In terms of matching titles and authors -- this should be able to be done with another script. Depending on how the front-end of the archive is coded (and someone with more background in UX would have a better idea), we could potentially have a "base template" and populate the author, story, etc. as fields within the HTML.
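(Regarding the "splicing comics back together" step mentioned in the quoted portion above, a minimal sketch of reassembling vertical slices with Pillow; the file paths and the assumption that the slices sort into reading order are illustrative only.)

```python
# Illustrative only, not part of any existing script: reassemble a comic that
# was saved as several vertical slices, assuming the slice filenames sort into
# reading order. Paths are placeholders.
import pathlib
from PIL import Image

def stack_vertically(slice_paths, out_path):
    slices = [Image.open(p).convert("RGB") for p in slice_paths]
    width = max(im.width for im in slices)
    height = sum(im.height for im in slices)
    combined = Image.new("RGB", (width, height), "white")
    y = 0
    for im in slices:
        combined.paste(im, (0, y))  # stack each slice below the previous one
        y += im.height
    combined.save(out_path)

# e.g. stack_vertically(sorted(pathlib.Path("issue_150/some_comic").glob("*.gif")),
#                       "some_comic_full.png")
```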
Yes, you understand correctly that "special" NT thumbnails would have to be manually sorted from "generic" thumbnails, even if I were to mass-download them all; however, some heuristics based on filename could "pre-sort" the images for easier manual review. Interestingly, comics in issues 150-157 also followed a "spliced" pattern with multiple images per page! So, I ended up modifying the script to just save every image in the main content on those pages. Another thing to note is that some comics don't seem to be on the server anymore, including almost all from Issue 756.

As an update on my end, I have every NT from 150 to present now downloaded. This also includes SWF files, but I don't have the means to test whether they work. I also found a simple script for testing whether image files were successfully downloaded, and will re-download ones that failed (e.g. bit/truncation errors).

The oldest issue I know of (with clickable links!) is HERE, and one can change the 2 in the link to anything up to 11 and still get content. Looking into my records, I seem to have those already downloaded for posterity -- probably with some other script I made once. So that still means I need to download pre-150 and the "stone age". I also have Storytelling, Poetry, and the Caption Competition saved -- though I will need to validate them. (I know for sure a few CCs were downloaded incorrectly, maybe because they required being logged in to view?) I will also start downloading the NT thumbnails in the near future -- thanks all for the input on that!

If no one else is already doing them, I can look into doing the AG and BC entries... or any/all other spotlights, really. But I can't start until tomorrow evening. Edit: Home. Have started saving the AG. I looked into the AG and the BC and have determined that (I think) the AG is much easier to save, at least in terms of modifying my script. For technical details-- To get BC results, one has to input values to a POST webform to get a response. I.e., it's not just a matter of changing a URL -- which would be super simple and the way the other spotlights I have archived thus far are coded.
I could write a macro (maybe?), or maybe I can look into making automatic POST requests. Regardless, it would require a fair bit more research/experimenting on my part to get going. So, perhaps you could work on the BC instead, for now?

About real names, and in some cases real email addresses -- another script can probably be written to match usernames that have spaces or email addresses and flag them for manual review (a rough sketch of what that could look like is just below).

Also, what other spotlights might we even want to try to archive? Random Contest? UL? Pet/Petpet? That Halloween Costume Contest from a couple years back?
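(A minimal sketch of that flagging pass, assuming the index is a JSON list of entries with an "author" field; the file name and field names are placeholders, not RielCZ's actual format.)

```python
# Illustrative sketch: flag credits that look like real names (contain a space)
# or email addresses, for manual review. "nt_index.json", "author", and "issue"
# are placeholder names.
import json
import re

EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def needs_review(credit: str) -> bool:
    credit = credit.strip()
    return " " in credit or EMAIL_RE.search(credit) is not None

with open("nt_index.json", encoding="utf-8") as f:
    entries = json.load(f)

for entry in entries:
    if needs_review(entry.get("author", "")):
        print(entry.get("issue"), "-", entry.get("author"))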
Sorry for the admittedly somewhat technical post, but just updating where I am with things.
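(And a sketch of the kind of image-integrity check RielCZ describes above, assuming Pillow is installed; the directory name is a placeholder.)

```python
# Illustrative sketch: Pillow's verify() raises on most truncated/corrupt files;
# anything that fails is collected so it can be re-downloaded.
import pathlib
from PIL import Image

failed = []
for path in pathlib.Path("nt_images").rglob("*"):
    if path.suffix.lower() not in {".gif", ".jpg", ".jpeg", ".png"}:
        continue
    try:
        with Image.open(path) as img:
            img.verify()  # checks file structure without fully decoding the pixels
    except Exception:
        failed.append(path)

print(f"{len(failed)} image(s) need re-downloading")
```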
|
|
|
Post by Twillie on Oct 12, 2021 21:24:28 GMT -5
Awesome to hear about the saving progress, and good to know when it comes to author names and titles! :3 And yeah, I figured regarding thumbnails, so I'll say again that I'll be happy to manually save or sort through custom thumbnails however we'd like to go about it!

Interestingly, comics in issues 150-157 also followed a "spliced" pattern with multiple images per page! So, I ended up modifying the script to just save every image in the main content on those pages. Another thing to note is that some comics don't seem to be on the server anymore, including almost all from Issue 756.

Ah yeah, the Issue 756 comics are unfortunately lost, I fear </3 They were glitched from the issue's release and never fully fixed, and the last time an editor was asked about them, they said that they're unsure where the comics are and are unable to fix any broken ones now. I also ran into the random missing comic in other issues and wondered what happened to them, but then came to a realization when trying to save neo_tomi's stuff. Many of his comics are gone, but he also made a lot of flash comics, so I have a feeling that's the case for those other missing ones.
Oh my, NT content went back even further than I thought =0 Now I feel compelled to go through 2-11 to see if there's anything resembling comics there, as before I always had it down that the first accessible NT comic is in Issue 52.
|
|
|
Post by June Scarlet on Oct 12, 2021 22:06:01 GMT -5
In terms of matching titles and authors -- this should be able to be done with another script. Depending on how the front-end of the archive is coded (and someone with more background in UX would have a better idea), we could potentially have a "base template" and populate the author, story, etc. as fields within the HTML.

Oh my gosh, that's me! I have a background in UX, and I'm currently taking a class in front-end web development! I'm learning React JS right now. If you can put the fields into a JSON format, I could (with work) turn it into a usable site! I'm currently building a website for a comic using React. I'm learning all sorts of stuff.
|
|
|
Post by Killix on Oct 13, 2021 1:01:04 GMT -5
I also ran into the random missing comic in other issues and wondered what happened to them, but then came to a realization when trying to save neo_tomi's stuff. Many of his comics are gone, but he also made a lot of flash comics, so I have a feeling that's the case for those other missing ones.
RIP my one Flash-based comic. X'D
|
|
|
Post by Ziporen on Oct 13, 2021 15:02:59 GMT -5
If no one else is already doing them, I can look into doing the AG and BC entries... or any/all other spotlights, really. But I can't start until tomorrow evening. Edit: Home. Have started saving the AG. I looked into the AG and the BC and have determined that (I think) the AG is much easier to save, at least in terms of modifying my script. For technical details-- To get BC results, one has to input values to a POST webform to get a response. I.e., it's not just a matter of changing a URL -- which would be super simple and the way the other spotlights I have archived thus far are coded.
I could write a macro (maybe?), or maybe I can look into making automatic POST requests. Regardless, it would require a fair bit more research/experimenting on my part to get going. So, perhaps you could work on the BC instead, for now?

I did the first 250 AG pages before deciding to switch to the BC, actually, because I realized that most of the early AG is probably in the Internet Archive's non-public archives. (Which isn't that helpful to us, but at least means that they do exist somewhere.) Meanwhile, most of the BC entries probably are not saved, due to the form. So far I have all Overall and Vandagyre done, along with 1/3 of Zafara.

There is a way to link to specific BC winner dates: /beauty/winners.phtml?type=process_species_select&previous_contest=&select_species=Species&prevcon=YYYY-MM-DD

The species name in the URL must be capitalized as a proper noun (ex. "Acara", not "acara" or "ACARA") with the exception of Overall results, which are under "main". The results page shows at most 61 winners at a time, or a minimum of 20 weeks of entries.
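(Based only on the URL pattern above, a rough sketch of walking the winners pages per species and date; the host, species list, starting date, and 20-week step are all assumptions rather than confirmed details.)

```python
# Illustrative sketch built from the query string Ziporen describes.
import datetime
import pathlib
import requests

BASE = "https://www.neopets.com/beauty/winners.phtml"   # assumed host
SPECIES = ["main", "Acara", "Vandagyre", "Zafara"]      # "main" = Overall results
START = datetime.date(2021, 10, 1)                      # arbitrary example date
STEP = datetime.timedelta(weeks=20)                     # each page covers >= 20 weeks

def winners_url(species: str, date: datetime.date) -> str:
    return (f"{BASE}?type=process_species_select&previous_contest="
            f"&select_species={species}&prevcon={date.isoformat()}")

out_dir = pathlib.Path("bc_winners")
out_dir.mkdir(exist_ok=True)
for species in SPECIES:
    date = START
    for _ in range(5):                                  # a few pages per species, as a demo
        resp = requests.get(winners_url(species, date), timeout=30)
        (out_dir / f"{species}_{date.isoformat()}.html").write_text(resp.text, encoding="utf-8")
        date -= STEP                                    # walk backwards in time
```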
It's worth noting that I've been saving everything as WARC files.
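(For reference, one way to produce WARC files from a Python crawler is the warcio package's capture_http helper, which records any HTTP exchange made inside the context manager; this is purely an illustration of the format, not necessarily the tooling Ziporen uses.)

```python
# Illustrative sketch: warcio writes the full request/response into a WARC file.
# warcio's docs ask for requests to be imported after capture_http.
from warcio.capture_http import capture_http
import requests

with capture_http("bc_overall.warc.gz"):                # output name is a placeholder
    requests.get("https://www.neopets.com/beauty/winners.phtml")  # host assumed
```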
|
|
|
Post by RielCZ on Oct 13, 2021 23:48:18 GMT -5
There is a way to link to specific BC winner dates: /beauty/winners.phtml?type=process_species_select&previous_contest=&select_species=Species&prevcon=YYYY-MM-DD The species name in the URL must be capitalized as a proper noun (ex. "Acara", not "acara" or "ACARA") with the exception of Overall results, which are under "main". The results page shows at most 61 winners at a time, or a minimum of 20 weeks of entries.
Very nice. Always great to meet someone with "inside information" on these matters. This should make scripting BC archiving much easier, thank you!

Oh my gosh, that's me! I have a background in UX, and I'm currently taking a class in front-end web development! I'm learning React JS right now. If you can put the fields into a JSON format, I could (with work) turn it into a usable site! I'm currently building a website for a comic using React. I'm learning all sorts of stuff.

Yay, that sounds great! (And I kind of had you in mind 'cause I recalled you did UX heh.) As I mentioned in my earliest post, HERE are some sample crawls. There should be a JSON index file and the appropriate content files there, if/when you might have some time to investigate front-end matters. If you need the JSON file formatted in a different way, let me know and I can see what I can do.

RIP my one Flash-based comic. X'D

I mean, I checked my records and I do have your SWF from Issue 170 saved. File size is 56K, but IDK how to run it necessarily.

EDIT: Have a working Art Gallery archiver. (At least for the "modern" AG template.)
|
|
|
Post by June Scarlet on Oct 14, 2021 10:42:26 GMT -5
RielCZ, I had a look. I think the JSON could use a property that links each entry to the related text/image in the downloaded files, but otherwise I think it's something I can work with. I'm currently working on my months-long class project that's due at the end of October, so I don't know that I'll have time until then, at the least. But I'll start by working on wireframes if I have time. In general, were people thinking that it should look more or less like the current layout, or be updated in a more modern web format?
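(A sketch of what that linking property could look like: each index entry gains a relative "file" path pointing at the saved page or image for the front end to resolve. Every field name and path here is made up for illustration.)

```python
# Illustrative sketch with hypothetical field names ("issue", "slug", "file");
# the real index may be shaped differently.
import json
import pathlib

with open("nt_index.json", encoding="utf-8") as f:
    entries = json.load(f)

for entry in entries:
    entry["file"] = f"issues/{entry['issue']}/{entry['slug']}.html"  # relative path to the saved page

missing = [e for e in entries if not pathlib.Path(e["file"]).exists()]
print(f"{len(missing)} entries point at files that are not on disk yet")

with open("nt_index_linked.json", "w", encoding="utf-8") as f:
    json.dump(entries, f, indent=2)
```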
|
|
|
Post by Ziporen on Oct 14, 2021 15:39:58 GMT -5
RielCZ: If it's a non-interactive SWF (just an animation with a "start" frame), then it could just be converted into a video format with Swivel. Otherwise, I can still run Flash files and have tools to decompile & convert SWFs in other ways if needed.
|
|
|
Post by Ginz ❤ on Oct 14, 2021 15:57:07 GMT -5
I took a screenshot of it to preserve it for posterity. (click it to see it bigger, it's an attachment)
Transcript: Dear Neopian Times Writers' Forum,

You have my express written consent to preserve all of my NT entries and to link to and reference my article "The Unexpectedly Educational Features of Neopets" on social media and elsewhere when trying to give those Metaverse users a clue of what Neopets is about. You can also share my following statement about my article: (tbc, it will be in quotation marks) Sure, it isn't exactly about describing what Neopets is about, but it does give you great insights about what players can do on there. Neopets is a very deep game with lots of options, so technically it could 'be about' something different for each user. What is definitely clear is that Neopets is not about paywalls or pay-to-win features; the Customization Spotlight even sort of handicaps paid customizations by preventing ones that contain even one paid (NC) item from competing with 100% free (NP only) entries.

Anyone reading this may post the article and statements elsewhere. I'm posting this here because I refuse to create my own social media accounts for any reason (and I don't want to create more accounts in general) but would still allow others to share my voice if they feel like they need to post any more statements from Neopians.
That is all. Have a nice day, Neopians! Also, I haven't been keeping up with the discussion in this thread, but I want to take this chance to say I think archiving the NT is a fantastic idea, and you have my full support for that.
|
|
|
Post by Killix on Oct 14, 2021 18:43:06 GMT -5
RIP my one Flash-based comic. X'D I mean, I checked my records and I do have your SWF from Issue 170 saved. File size is 56K, but IDK how to run it necessarily.

Oh, I'm sure I still have mine saved somewhere. I'm just lamenting the idea that Flash comics are no longer viewable in the NT.
|
|
|
Post by Twillie on Oct 15, 2021 17:42:18 GMT -5
That neoboard thread has me kind of curious about how they found out about our archiving conversation: whether they came here as a guest and saw it, or heard about it through other means, like that time it came up on the Neocord. It has me wondering how, with time, talk of this may spread outside of the forum and what that might look like.
|
|
|
Post by RielCZ on Oct 15, 2021 19:55:58 GMT -5
Thanks, Ginz ❤.

That neoboard thread has me kind of curious about how they found out about our archiving conversation: whether they came here as a guest and saw it, or heard about it through other means, like that time it came up on the Neocord. It has me wondering how, with time, talk of this may spread outside of the forum and what that might look like.

Fair enough. I mean, we could always ask the user, heh. Still, you're right about the word spreading, and perhaps we should be mindful that non-NTWF groups/individuals may eventually want input into the creation/maintenance of The Archives.

For another update on my end, I have the Art Gallery archived (both the one at /art/gallery.phtml and the old one at /contributions_pictures.phtml). I also have a working Petpet Spotlight archiver, which is currently running. From what I can tell, the page layout for the Pet Spotlight is essentially the same as that for the Petpet, so it should work for the Pet as well. Still gotta work on the pre-150 NT script though.
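(To illustrate the "same layout, same script" point, a generic sketch of the kind of loop such an archiver might use: fetch numbered pages from a URL template, save the HTML, and pull down any images the page references. The URL templates and page counts are placeholders, not the real spotlight parameters.)

```python
# Illustrative sketch only: the actual spotlight URLs and parameters differ,
# so the template below is a placeholder to be filled in per spotlight.
import pathlib
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def archive_spotlight(name: str, url_template: str, pages: int) -> None:
    out = pathlib.Path("spotlights") / name
    out.mkdir(parents=True, exist_ok=True)
    for page in range(1, pages + 1):
        url = url_template.format(page=page)
        html = requests.get(url, timeout=30).text
        (out / f"page_{page:04d}.html").write_text(html, encoding="utf-8")
        # save every image referenced on the page alongside the HTML
        for img in BeautifulSoup(html, "html.parser").find_all("img"):
            src = img.get("src")
            if not src:
                continue
            img_url = urljoin(url, src)
            data = requests.get(img_url, timeout=30).content
            (out / img_url.rsplit("/", 1)[-1]).write_bytes(data)

# Because the Pet and Petpet Spotlight pages share a layout, the same function
# could be called twice with different (hypothetical) URL templates:
# archive_spotlight("petpet", "https://www.neopets.com/...?page={page}", 500)
# archive_spotlight("pet", "https://www.neopets.com/...?page={page}", 500)
```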
|
|
|
Post by Blueysicle on Oct 15, 2021 22:15:59 GMT -5
Apologies for not keeping up with the thread; I've been mostly busy saving the Stone Age issues. As I've said before, it's a very crude way of doing it since I'm just copy-and-pasting the text into a word processor, but I figure the main thing is that the actual content is saved. It's also quite tedious. Each issue takes me roughly 20-25 minutes, which isn't that long, but it does get kinda mind-numbing. It does make me appreciate you guys, RielCZ and June Scarlet (and anyone else that's good with technology that I might be forgetting), all the more for being able to archive the entries much faster and more efficiently. This would be such a nightmare of a project otherwise, so I'm really thankful for your efforts.

Currently I've got Issues 23 - 54 saved. I was about to work on Issue 55 a little while ago, but it seems like that issue is inaccessible, as all the URLs just lead to an error page. There's also an article in Issue 44 that appears to be lost. I actually had no idea that Issues 2 - 11 could be accessed! I guess if 23 - 68 was the "Stone Age", 2 - 11 must be like, the Jurassic Period Neopian Times?

Going back to the point about real names, another way we could credit them would be including their first name as is and then initialing their last name. So to use the example Killix used, Joe Smith would be Joe S.
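(Purely for illustration, the two crediting conventions discussed in this thread -- Duke Pikachu's "J. D." style and the "Joe S." style above -- as tiny helpers; neither is part of any existing archive script.)

```python
# Illustrative helpers for the two name-shortening conventions discussed in the thread.
def initials(full_name: str) -> str:
    """'John Doe' -> 'J. D.'"""
    return " ".join(f"{part[0].upper()}." for part in full_name.split())

def first_name_last_initial(full_name: str) -> str:
    """'Joe Smith' -> 'Joe S.'"""
    parts = full_name.split()
    if len(parts) == 1:
        return parts[0]
    return f"{parts[0]} {parts[-1][0].upper()}."

print(initials("John Doe"))                  # J. D.
print(first_name_last_initial("Joe Smith"))  # Joe S.
```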
|
|