HTTRUTA Crawler Project
=======================

Download all links from a feed using tools like
[httrack](http://www.httrack.com) and [wkhtmltopdf](https://wkhtmltopdf.org).
This is the engine behind the [Cache](https://cache.fluxo.info) feature used by
the [Semantic Scuttle](http://semanticscuttle.sourceforge.net/) instance known
as [Fluxo de Links](https://links.fluxo.info).

Usage
-----

Place this script somewhere and set up a cron job like this:

`*/5 * * * * /var/sites/cache/httruta/httracker > /dev/null 2>&1`
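
Since the job fires every five minutes, a slow crawl could still be running when the next one starts. Whether `httracker` already guards against that is not stated here, so the following is only a minimal sketch assuming it does not. The function name `run_locked` and the lock path `/tmp/httruta.lock` are hypothetical and not part of this project; the sketch also does not clean up a stale lock left behind by a killed process.

```shell
#!/bin/sh
# Hypothetical helper (not part of httruta): run a command only when no
# earlier run still holds the lock. mkdir is atomic, so it serves as the lock.
run_locked() {
    lockdir="$1"
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"                 # run the wrapped command
        rc=$?
        rmdir "$lockdir"     # release the lock
        return "$rc"
    fi
    echo "previous run still active, skipping" >&2
    return 0
}

# Example call from a wrapper script (lock path is an assumption):
# run_locked /tmp/httruta.lock /var/sites/cache/httruta/httracker
```

The cron entry would then point at the wrapper script instead of calling `httracker` directly.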