Medium List Scraper

Description

Are you a heavy user of Zotero and Medium? Do you want to apply your Zotero color coding to Medium articles and export the notes to Obsidian? This tool makes it possible!

This Python script scrapes articles from multiple lists on Medium.com and saves them to a library.csv file, which serves as a local database. On every run, the script compares the scraped articles against the library and saves only the new ones.
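The compare-and-save step could look roughly like the sketch below. It is not the code from main.py: the column names (list_name, title, url) and the function names are assumptions made for illustration, and the sketch treats the article URL as the unique key.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the real library.csv may use other columns.
FIELDS = ["list_name", "title", "url"]


def load_library(path):
    """Return the rows already stored in the library (empty if no file yet)."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def find_new_articles(library_rows, scraped_rows):
    """Return the scraped rows whose URL is not yet in the library."""
    known = {row["url"] for row in library_rows}
    new_rows = []
    for row in scraped_rows:
        if row["url"] not in known:
            known.add(row["url"])  # also skips duplicates within one scrape
            new_rows.append(row)
    return new_rows


def append_to_library(path, rows):
    """Append rows to the library, writing the header if the file is new."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)
```

Keying on the URL means an article that is renamed on Medium is not stored twice, as long as its URL stays the same.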

Additionally, the scraper downloads each article as a PDF. It fetches the article through scribe.rip, an alternative frontend for Medium articles, and converts it with Percollate, a command-line tool that turns web pages into cleanly formatted PDFs.
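As a rough sketch of this step, the snippet below rewrites a Medium URL to its scribe.rip equivalent and builds a Percollate command for it. The function names are hypothetical, and the sketch assumes that swapping the host for scribe.rip and keeping the path is enough to reach the article; the actual script may do this differently.

```python
import subprocess
from urllib.parse import urlsplit, urlunsplit


def to_scribe_url(medium_url):
    """Point a Medium article URL at scribe.rip, dropping query and fragment."""
    parts = urlsplit(medium_url)
    return urlunsplit(("https", "scribe.rip", parts.path, "", ""))


def percollate_command(medium_url, output_pdf):
    """Build the `percollate pdf` command line for one article."""
    return ["percollate", "pdf", "--output", str(output_pdf),
            to_scribe_url(medium_url)]


def download_pdf(medium_url, output_pdf):
    """Run Percollate; requires the percollate CLI to be on the PATH."""
    subprocess.run(percollate_command(medium_url, output_pdf), check=True)
```

Building the command as a list and passing it to subprocess.run avoids shell quoting problems with titles or URLs that contain special characters.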

The downloaded PDFs are saved into separate folders (one for each list) within a downloads folder.
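The folder layout described above can be sketched as follows. The helper names and the file-naming rule (strip punctuation, replace spaces with underscores) are assumptions for illustration; only the downloads folder with one subfolder per list comes from the description.

```python
import re
from pathlib import Path


def safe_name(text):
    """Turn a list name or article title into a filesystem-friendly name."""
    cleaned = re.sub(r"[^\w\- ]", "", text).strip()
    return re.sub(r"\s+", "_", cleaned) or "untitled"


def pdf_path(list_name, title, root="downloads"):
    """Return downloads/<list>/<title>.pdf, creating the list folder if needed."""
    folder = Path(root) / safe_name(list_name)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / f"{safe_name(title)}.pdf"
```

For example, pdf_path("Reading List", "Hello, World!") creates downloads/Reading_List if it does not exist and returns the path downloads/Reading_List/Hello_World.pdf.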

Finally, you can add these downloaded PDFs to Zotero to begin your note-taking and color-coding process.

Functionality

Scraping Medium Lists: The script scrapes articles from your specified Medium lists and saves them into a local library.csv file.
PDF Generation: The scraper downloads each article as a PDF using scribe.rip and Percollate for a visually appealing format.
File Organization: The downloaded PDFs are organized into separate folders for each list within a downloads folder.
Zotero Integration: You can import the downloaded PDFs into Zotero for note-taking and color-coding.

Requirements

Installation

Clone or download the repository.
Set up a virtual environment by running python3 -m venv .venv and activate it.
Install all dependencies with pip install -r requirements.txt.
Install Percollate by following the installation instructions in the Percollate repository.
Edit the medium_lists.csv file to add your own public lists.
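The layout of medium_lists.csv is not documented here, so the following is only a sketch of how such a file might be read. It assumes a header row with two hypothetical columns, name and url; check the file shipped with the repository for the actual format before editing it.

```python
import csv
import io


def read_lists(csv_text):
    """Parse list definitions, skipping rows without a name or a URL."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lists = []
    for row in reader:
        name = (row.get("name") or "").strip()
        url = (row.get("url") or "").strip()
        if name and url:
            lists.append((name, url))
    return lists


# Placeholder content; the URL below is an example, not a real list.
sample = """name,url
Reading List,https://medium.com/@user/list/reading-list-abc123
Incomplete Row,
"""
lists = read_lists(sample)
```

Skipping incomplete rows keeps a half-edited file from breaking a scrape run.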

Usage

Run the scraper from the repository root with the following command: python3 main.py.

Limitations

This scraper is built for the Brave browser. To use it, either install Brave or edit the code to use Chrome instead.
The scraper will only work with public lists.

Christoph-Beckmann/Medium-List-Scraper
