Feedbridge

Feedbridge (hosted version / demo: feedbridge.notmyhostna.me) is a tool that provides RSS feeds for sites that don't have one, or that only offer a feed of headlines. For each site (or kind of site) you want to generate a feed for, you implement a plugin with a custom scraping strategy. Feedbridge doesn't persist old items, so once an item disappears from the site being scraped it also disappears from the feed, much like most feeds today that only contain the latest items. It publishes feeds conforming to Atom, RSS 2.0, and JSON Feed Version 1.

There are a bunch of web apps doing something similar, some of which even let you drag and drop selectors to create a feed. That didn't work well for the site I was trying it on, so I decided to build this. (Also, it was fun to do.)

API

GET /feed/list

Returns a list of available plugins.

GET /feed/{plugin}/{format}

Returns the feed based on a given plugin and output format. That's the URL you should use in your feed reader.

  • plugin: The name of the plugin as returned by String()
  • format: The format the feed should be returned in; can be rss, atom, or json. By default it's RSS.

POST /feed/{plugin}/refresh (Authentication required)

Triggers a refresh for a given plugin by running a single scrape of that plugin.

GET /metrics

Returns the exported Prometheus metrics.

API Authentication

Authentication is done through a query parameter (auth_token); the expected token is configured via the API_TOKEN environment variable.

Configuration and Operation

Environment

The following environment variables are available; they all have sensible defaults and don't need to be set explicitly.

  • REFRESH_INTERVAL: The interval in which feeds get rescraped in minutes (Default: 15)
  • API_TOKEN: A user defined token that is used to protect sensitive API routes (Default: changeme)
  • ENVIRONMENT: Environment can be prod or develop. develop sets the log level to info (Default: develop)
  • PORT: Port that Feedbridge is running on (Default: 8080)

There are currently two storage backends: an in-memory and a disk-backed implementation. Depending on which one you choose, there are additional options you can set.

  • STORAGE_BACKEND: Set to memory to keep everything in-memory or persistent to persist the cache to disk. (Default: memory)

In Memory

  • CACHE_EXPIRATION: The expiration time of the cache in minutes (Default: 30)
  • CACHE_EXPIRED_PURGE: The interval at which the expired cache elements will be purged in minutes (Default: 60)

Persistent

  • STORAGE_PATH: Set the storage location of the cache on disk. (Default: ./feedbridge-data)

Run with Docker

You can change all of these options in the included Docker Compose files and run the project with docker-compose -f docker-compose.yml up -d. There are two sample Compose files, one preconfigured for each of the two available storage backends.

Monitoring

Since the project exports Prometheus metrics, you can use Grafana to see how many items are being scraped and how fast requests are served. The included grafana-dashboard.json can be imported into Grafana.

Status

This is a work in progress and pretty rough right now. The API might change and things may get moved around.

Acknowledgements & Credits

It uses the neat gorilla/feeds library to generate standards-conformant Atom, RSS 2.0, and JSON feeds. The Gopher was sourced from github.com/egonelbre; the RSS icon comes from Wikipedia and was added by me. Thanks!