HTTRUTA Feed Crawler Project
============================

Downloads every link listed in a feed using httrack. This is the engine behind
the "Cache" feature of the Semantic Scuttle instance at https://links.sarava.org.
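The general idea can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the actual `httracker` script: the function names (`extract_links`, `url_to_dir`, `mirror_feed`), the use of `curl` to fetch the feed, and the checksum-based directory naming are all hypothetical. The link extraction is deliberately naive: it assumes an RSS feed with each `<link>…</link>` pair on a single line, and it also picks up the channel's own `<link>`.

```shell
#!/bin/sh
# Hypothetical sketch of a feed-driven mirror; not the real httracker script.

# Print the contents of every <link> element in a feed read from stdin.
# Naive on purpose: no CDATA handling, one <link>...</link> pair per line,
# and only the &amp; entity is decoded.
extract_links() {
    grep -o '<link>[^<]*</link>' |
        sed -e 's/^<link>//' -e 's/<\/link>$//' -e 's/&amp;/\&/g'
}

# Derive a stable directory name from a URL.
# Using cksum is an arbitrary choice made for this sketch.
url_to_dir() {
    printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

# Mirror every link of the feed at $1 into its own directory under $2.
mirror_feed() {
    feed_url=$1
    dest=$2
    curl -fsS "$feed_url" | extract_links | while IFS= read -r url; do
        target="$dest/$(url_to_dir "$url")"
        # Skip links that already have a mirror from a previous run.
        [ -d "$target" ] && continue
        httrack "$url" -O "$target"
    done
}
```

With these definitions, a call such as `mirror_feed "$FEED_URL" "$CACHE_DIR"` (both variables are placeholders) would create one httrack mirror per link. Skipping existing directories keeps repeated runs cheap, at the cost of never refreshing a page once it has been fetched.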

Usage
-----

Place this script somewhere and set up a cron job like this:

`*/5 * * * * /var/sites/arquivo/httruta/httracker > /dev/null 2>&1`
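Since the job runs every five minutes, a crawl that takes longer than that could overlap with the next run. Whether the script already guards against this is not stated here, so the following is only a possible wrapper, assuming a hypothetical `run_locked` helper and a lock directory of your choosing. It relies on `mkdir` being atomic: only one caller can create the directory.

```shell
#!/bin/sh
# Hypothetical wrapper to avoid overlapping runs; not part of httruta.

# Run a command only if the lock directory $1 can be created.
# If it already exists, another run is assumed to be in progress.
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "already running, skipping" >&2
        return 0
    fi
    "$@"
    status=$?
    # Release the lock and pass the command's exit status through.
    rmdir "$lockdir"
    return $status
}
```

The cron entry would then call a small script that runs `run_locked /some/lock/dir /var/sites/arquivo/httruta/httracker`, where the lock path is a placeholder. One known limitation of this approach: if the process is killed before `rmdir` runs, the stale lock directory has to be removed by hand.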

TODO
----

- Support other fetchers, such as youtube-dl and quvi.
- Clean up content that is no longer referenced in the Semantic Scuttle database.
- Integrate with wkhtmltopdf (http://wkhtmltopdf.org).