CheckManga

A simple web scraper that checks manga hosting websites to see whether any of the manga I've been reading has been updated. It stores data on each title, each manga hosting website, and how to scrape each site in a database, and it updates the database whenever it finds that a new chapter has been released. I'm working on a new "bookmarking" feature that will let me keep track of manga where I am not yet caught up to the most recent chapter or where the series has been completed.

Instructions

Open a terminal and navigate to the directory with checkmanga.
Run pip install -r requirements.txt to install the modules that checkmanga.py needs.
Look at myMangaList.json and mySitesList.json as examples of how to structure the JSON. Feel free to name your JSON files differently; just be sure to change the names in generateDB.py before moving on to the next step.
Run generateDB.py to create a db for your manga. Feel free to change the name from mycheckmanga.db to whatever you want; just be sure to change the name in checkmanga.py as well before moving on to the next step.
Run checkmanga.py. Any updates will be printed to the terminal!
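
To make the "generate a db from JSON" step more concrete, here is a minimal sketch of the idea. It is not the code in generateDB.py: the function name generate_db, the JSON keys (title, site, most_recent_chapter), and the single-table layout are all assumptions made for illustration, and it uses an in-memory database instead of mycheckmanga.db.

```python
import json
import sqlite3


def generate_db(manga_json_text, db_path=":memory:"):
    """Create a manga table from a JSON array and return the open connection.

    The keys used here (title, site, most_recent_chapter) are hypothetical,
    not necessarily the ones used in myMangaList.json.
    """
    entries = json.loads(manga_json_text)
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS manga ("
        "title TEXT PRIMARY KEY, site TEXT, most_recent_chapter INTEGER)"
    )
    # Named placeholders are filled from each dict in the JSON array.
    conn.executemany(
        "INSERT OR REPLACE INTO manga (title, site, most_recent_chapter) "
        "VALUES (:title, :site, :most_recent_chapter)",
        entries,
    )
    conn.commit()
    return conn


# Made-up example data standing in for the contents of a JSON file.
example_json = """
[
  {"title": "Example Manga A", "site": "examplesite", "most_recent_chapter": 12},
  {"title": "Example Manga B", "site": "examplesite", "most_recent_chapter": 87}
]
"""

conn = generate_db(example_json)
for row in conn.execute("SELECT title, most_recent_chapter FROM manga ORDER BY title"):
    print(row)
```

Passing a file path as db_path would write the database to disk instead of keeping it in memory.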

Update 4/9/2015

-Web scraper is now site-independent (it no longer depends on a single site for all of the manga scraping).
-Changed the database schema so that there are now 3 tables in the SQLite database (manga, sites, and tags).
-Wrote a function to generate a database from JSON files.
-Can now add 'Completed' manga to the db, which will not be scraped.
-There is now a distinction between LastChapterRead and MostRecentChapter.
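
As a rough illustration of the changes above, the sketch below sets up three tables named manga, sites, and tags and shows how 'Completed' titles could be skipped and how LastChapterRead and MostRecentChapter differ. Only the table names, the two chapter column names, and the 'Ongoing' and 'Completed' statuses come from this README; every other column and both helper functions are assumptions, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: only the table names and the two chapter columns
# are taken from the README; the remaining columns are illustrative.
SCHEMA = """
CREATE TABLE sites (Name TEXT PRIMARY KEY, Url TEXT);
CREATE TABLE manga (
    Title TEXT PRIMARY KEY,
    Site TEXT,
    Status TEXT,
    LastChapterRead INTEGER,
    MostRecentChapter INTEGER
);
CREATE TABLE tags (Title TEXT, Tag TEXT);
"""


def titles_to_scrape(conn):
    """Return titles that still need scraping, skipping 'Completed' ones."""
    rows = conn.execute(
        "SELECT Title FROM manga WHERE Status != 'Completed' ORDER BY Title"
    )
    return [row[0] for row in rows]


def unread_chapters(conn, title):
    """Return how many released chapters have not been read yet."""
    row = conn.execute(
        "SELECT MostRecentChapter - LastChapterRead FROM manga WHERE Title = ?",
        (title,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO manga VALUES (?, ?, ?, ?, ?)",
    [
        ("Example Ongoing", "examplesite", "Ongoing", 40, 43),
        ("Example Finished", "examplesite", "Completed", 120, 120),
    ],
)
print(titles_to_scrape(conn))
print(unread_chapters(conn, "Example Ongoing"))
```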

Useful additions:

-Being able to add, update, or delete entries in bulk rather than one at a time, driven by a JSON or text file. (A db can already be generated from JSON files; what is still needed are functions for bulk addition, deletion, and possibly update from a JSON or text file.)

-Being able to "bookmark" the most recent chapter read for a manga that is either completed or newly started and not yet caught up to the latest released chapter. (Change the statuses from 'Ongoing' and 'Completed' to 'Scrape' and 'Bookmark'. Figure out how to merge the LastChapterRead and MostRecentChapter columns.)

-Figure out naming conventions for the column names in the db and the key names in the example JSON files (how to stay consistent despite two different sets of rules?). Set a foreign key in the tags table?

-Make a branch for this command line version. Flesh out the master branch as a web app with Flask or Django.

-Add in more manga sites.
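
One possible shape for the bulk add/delete idea in the list above is sketched here. Nothing in it exists in the project yet: the bulk_apply function, the {"add": [...], "delete": [...]} file layout, and the reduced manga table are assumptions, and the 'Scrape' and 'Bookmark' statuses are the proposed ones rather than the current ones.

```python
import json
import sqlite3


def bulk_apply(conn, json_text):
    """Apply a hypothetical {"add": [...], "delete": [...]} change set.

    All changes run in one transaction, so a bad entry rolls back the batch.
    """
    changes = json.loads(json_text)
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO manga (Title, Status, LastChapterRead) "
            "VALUES (:Title, :Status, :LastChapterRead)",
            changes.get("add", []),
        )
        conn.executemany(
            "DELETE FROM manga WHERE Title = ?",
            [(title,) for title in changes.get("delete", [])],
        )


# Reduced, illustrative table; the real manga table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE manga (Title TEXT PRIMARY KEY, Status TEXT, LastChapterRead INTEGER)"
)
conn.execute("INSERT INTO manga VALUES ('Dropped Series', 'Scrape', 5)")
conn.commit()

changes_json = """
{
  "add": [
    {"Title": "New Series A", "Status": "Scrape", "LastChapterRead": 10},
    {"Title": "New Series B", "Status": "Bookmark", "LastChapterRead": 3}
  ],
  "delete": ["Dropped Series"]
}
"""

bulk_apply(conn, changes_json)
print([row[0] for row in conn.execute("SELECT Title FROM manga ORDER BY Title")])
```

Using INSERT OR REPLACE means that an "add" entry for an existing title also acts as an update, which may or may not be the behavior wanted for a separate bulk update function.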

pjfan/CheckManga