HTTRUTA Crawler Project
=======================

Download all links from a feed using tools like
[httrack](http://www.httrack.com) and [wkhtmltopdf](https://wkhtmltopdf.org).
This is the engine behind the [Cache](https://cache.fluxo.info) feature used by
the [Semantic Scuttle](http://semanticscuttle.sourceforge.net/) instance known
as [Fluxo de Links](https://links.fluxo.info).

Usage
-----

Place this script somewhere on your system and set up a cron job like this:

`*/5 * * * * /var/sites/cache/httruta/httracker > /dev/null 2>&1`
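
If the script lives somewhere else, or a different interval is wanted, the entry can be generated rather than typed by hand. The following is only a minimal sketch: `cron_entry` is a hypothetical helper, not part of this project, and it assumes the interval is given in minutes. It uses the `> /dev/null 2>&1` form of redirection because cron commonly runs jobs with `/bin/sh`, where the bash-only `&>` shorthand may not behave as expected.

```shell
#!/bin/sh
# Hypothetical helper (not part of httruta): print a crontab entry that
# runs the given script every N minutes and discards all of its output.
cron_entry() {
    interval="$1"   # minutes between runs, e.g. 5
    script="$2"     # absolute path to the httracker script
    printf '*/%s * * * * %s > /dev/null 2>&1\n' "$interval" "$script"
}

# Print the entry for the path used in this README.
cron_entry 5 /var/sites/cache/httruta/httracker

# To append it to the current user's crontab, something like the line
# below should work on most cron implementations (left commented out
# on purpose, since it modifies the crontab):
# ( crontab -l 2>/dev/null; cron_entry 5 /var/sites/cache/httruta/httracker ) | crontab -
```

Running the sketch as-is only prints the entry, so it can be reviewed before being installed with `crontab -e`.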