Add parser for mtgstory.com #1500

Merged: 8 commits, Sep 25, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@ eslint/packed.js
 eslint/index.csv
 node_modules
 plugin/jszip/dist/jszip.min.js
+package-lock.json
3 changes: 2 additions & 1 deletion package.json
@@ -69,7 +69,8 @@
 { "name": "ImmortalDreamer"},
 { "name": "ktrin"},
 { "name": "Tyderion"},
-{ "name": "nozwock" }
+{ "name": "nozwock"},
+{ "name": "Darthagnon"}
 ],
 "license": "GPL-3.0-only",
 "bugs": {
85 changes: 85 additions & 0 deletions plugin/js/parsers/MagicWizardsParser.js
@@ -0,0 +1,85 @@
/*
MagicWizardsParser.js v0.72

Parser for Magic: The Gathering fiction, found on:
- mtgstory.com (redirect)
- https://magic.wizards.com/en/story (2023-2024)
- https://magic.wizards.com/en/articles/columns/magic-story (2014-2018)
- Archive.org versions of the above
- TODO: mtglore.com (redirects & mirrors)
- TODO: https://magic.wizards.com/en/story (Q4 2018-2022)
- TODO: Planeswalkers & Planes Databank
- TODO: Featured story slider Q1 2018
- UNTESTED: http://www.wizards.com/Magic/Magazine/Article.aspx (2014 and earlier)
- WONTFIX: hanweirchronicle.com (Tumblr blog, mostly image posts)
*/
"use strict";

// Register the parser for magic.wizards.com (archive.org is implicit)
parserFactory.register("magic.wizards.com", () => new MagicWizardsParser());

class MagicWizardsParser extends Parser {
    constructor() {
        super();
    }

    // Extract the list of chapter URLs
    async getChapterUrls(dom) {
        // Candidate links for each known page layout
        const selectors = "article a, .article-content a, #content article a, #content .article-content a, .articles-listing .article-item a, .articles-bloc .article .details a";
        // Filter out author links using their URL pattern
        return [...dom.querySelectorAll(selectors)]
            .filter(link => !this.isAuthorLink(link))
            .map(link => this.linkToChapter(link));
    }

    // Helper function to detect if a link is an author link
    isAuthorLink(link) {
        // Author archive pages have URLs containing "/archive?author="
        const authorPattern = /\/archive\?author=/;
        return authorPattern.test(link.href);
    }

    // Format chapter links into a standardized structure
    linkToChapter(link) {
        const titleSelectors = [
            "h3",                    // First option: <h3> tag
            ".article-item .title",  // Second option: <p class="title"> inside .article-item
            ".details .title"        // Third option: <p class="title"> inside .details
        ];

        let titleElement = null;

        // Try each selector in turn and stop at the first matching element
        for (const selector of titleSelectors) {
            titleElement = link.closest("article")?.querySelector(selector) ||
                link.closest(".article-item")?.querySelector(selector) ||
                link.closest(".details")?.querySelector(selector);

            if (titleElement) {
                break;
            }
        }

        // Fall back to the link text itself if no title element was found (handles simpler layouts)
        const title = titleElement ? titleElement.textContent.trim() : link.textContent.trim();

        return {
            sourceUrl: link.href,
            title: title
        };
    }

    // Extract the content of the chapter
    findContent(dom) {
        return dom.querySelector("#content article, .article_detail #main-content article, #article-body article, #primary-area section, section article, section, .article_detail #main-content");
    }

    // Grab cover image
    findCoverImageUrl(dom) {
        return util.getFirstImgSrc(dom, ".swiper-slide img, article img");
    }
}
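
Note: the author-link filter and the title fallback in MagicWizardsParser.js can be tried outside the extension. The following is a minimal sketch and not part of this PR. It assumes plain objects in place of DOM anchors, and the URLs, sample titles, and the `toChapter` helper are hypothetical stand-ins chosen for illustration.

```javascript
// Sketch only: plain objects stand in for <a> elements, so no DOM is needed.
// The URLs and titles below are made-up examples, not taken from the site.
const authorPattern = /\/archive\?author=/;

// Same regular-expression test that isAuthorLink() applies to link.href
function isAuthorLink(link) {
    return authorPattern.test(link.href);
}

// Hypothetical helper mirroring the fallback in linkToChapter(): use the
// nearby title element when one exists, otherwise the link's own text.
function toChapter(link, titleElement) {
    const source = titleElement ? titleElement : link;
    return { sourceUrl: link.href, title: source.textContent.trim() };
}

const links = [
    { href: "https://magic.wizards.com/en/story/example-episode-1", textContent: "  Read more  " },
    { href: "https://magic.wizards.com/en/archive?author=example-author", textContent: "Example Author" },
    { href: "https://magic.wizards.com/en/story/example-episode-2", textContent: " Example Episode 2 " }
];

// Pretend only the first link has a separate title element next to it
const titles = [{ textContent: " Example Episode 1 " }, null, null];

const chapters = links
    .map((link, i) => ({ link, titleElement: titles[i] }))
    .filter(({ link }) => !isAuthorLink(link))
    .map(({ link, titleElement }) => toChapter(link, titleElement));

console.log(chapters);
```

In this sketch the author archive link is dropped, the first chapter takes its title from the separate title element, and the second falls back to the trimmed link text.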
1 change: 1 addition & 0 deletions plugin/popup.html
@@ -637,6 +637,7 @@ <h3>Instructions</h3>
 <script src="js/parsers/LnmtlParser.js"></script>
 <script src="js/parsers/MachineTranslationParser.js"></script>
 <script src="js/parsers/MadnovelParser.js"></script>
+<script src="js/parsers/MagicWizardsParser.js"></script>
 <script src="js/parsers/MangadexParser.js"></script>
 <script src="js/parsers/MandarinducktalesParser.js"></script>
 <script src="js/parsers/MangakakalotParser.js"></script>
1 change: 1 addition & 0 deletions readme.md
@@ -58,6 +58,7 @@ Credits
 * ktrin
 * nozwock
 * Tyderion
+* Darthagnon

 ## How to use with Baka-Tsuki:
 * Browse to a Baka-Tsuki web page that has the full text of a story.