[[!meta title="Data science, lean databases and formats"]]

## Basic

* Ontologies and how to deal with lists.
* Standards: schema.org, microdata, microformats, json, yaml, csv, dot, vcard.
* Intelligence: how to easily search, index and produce outputs with structured data?
* Samples: TODO and [ChangeLog](http://keepachangelog.com) (see [yankee: Changelogs meet YAML](https://github.com/studio-b12/yankee)).

## Software

* [mtail](https://packages.debian.org/stable/mtail).
* [Scrapy | A Fast and Powerful Scraping and Web Crawling Framework](https://scrapy.org/).
* [phantomjs in stretch](https://packages.debian.org/stable/phantomjs).
* [wpull](https://wpull.readthedocs.io/en/master/usage.html).
* [Darktable - virtual lighttable and darkroom for photographers](https://packages.debian.org/stable/darktable).
* OsmAnd and GPX tracks.

## API, bigdata, etc

* https://stripe.com/blog/idempotency
* https://botman.io
* https://github.com/metabase/metabase
* [Apache Drill](https://drill.apache.org/), [presto](https://github.com/prestodb/presto), hadoop, etc.
* [Redash](https://redash.io/).
* [TensorFlow](https://www.tensorflow.org/).
* [Wikidata](https://www.wikidata.org).
* [Swagger Specification](http://swagger.io/specification/).

## Datasets

* [DuckDuckGo Instant Answer API](https://duckduckgo.com/api) ([example](http://api.duckduckgo.com/?q=micropython&format=json&pretty=1)).
* [Search APIs | ProgrammableWeb](https://www.programmableweb.com/category/search/apis?category=20055).
* [Have I been pwned? API v2](https://haveibeenpwned.com/API/v2).
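
As a rough illustration of the question raised under "Basic" (how to search, index and produce outputs with structured data), here is a minimal Python sketch using only the standard library. The TODO list, its column names (`id`, `task`, `tags`) and the helper functions are all hypothetical, chosen for this example only. It reads a list kept as CSV, builds a tag index and emits the matches as JSON.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical TODO list kept as CSV; the columns are assumptions for this sketch.
TODO_CSV = """id,task,tags
1,Write changelog,docs;release
2,Index GPX tracks,gps;data
3,Publish release notes,release
"""


def load_items(text):
    """Parse CSV text into a list of dicts, splitting the tags column on ';'."""
    items = []
    for row in csv.DictReader(io.StringIO(text)):
        row["tags"] = row["tags"].split(";") if row["tags"] else []
        items.append(row)
    return items


def build_index(items):
    """Map each tag to the ids of the items that carry it."""
    index = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            index[tag].append(item["id"])
    return dict(index)


def search(items, index, tag):
    """Return the items carrying the given tag, in file order."""
    wanted = set(index.get(tag, []))
    return [item for item in items if item["id"] in wanted]


items = load_items(TODO_CSV)
index = build_index(items)

# Produce an output in another of the listed formats (JSON).
print(json.dumps(search(items, index, "release"), indent=2))
```

The same shape should carry over to YAML or vCard input by swapping `load_items` for another parser, since the index and search steps only rely on a list of dicts.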