article-epub
============
Description
-----------
A command-line tool, written in Python, that converts scientific articles available as HTML into ePub format for reading on an ePub-capable e-reader.
It uses a plugin system with a "recipe" for each supported scientific publisher, and takes an article URL, title, or (ideally) DOI as input.

You need legal access to any article you want to convert, e.g. via a university library.
Like most web-scraping applications, the provided recipes are liable to break frequently.

Currently, the following publishers are supported:
* ScienceDirect (Elsevier)
* Springer
* Wiley
* Oxford
* BioOne
* Royal Society
* PLoS ONE
* National Institutes of Health (NIH)
* NRC Research Press
* Taylor & Francis
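
The recipe-per-publisher design described above can be pictured with a small registry. This is only an illustrative sketch, not the project's actual code: the `Recipe` base class, the `domains` attribute, the `register` decorator, and `find_recipe` are all hypothetical names, and the two hostnames are used purely as examples.

```python
from urllib.parse import urlparse

# Hypothetical registry: maps a publisher name to its recipe class.
RECIPES = {}


def register(cls):
    """Class decorator that adds a recipe class to the registry."""
    RECIPES[cls.name] = cls
    return cls


class Recipe:
    """Hypothetical base class; each publisher gets one subclass."""
    name = ""
    domains = ()  # hostnames this recipe claims to handle

    @classmethod
    def matches(cls, url):
        # Accept the exact hostname or any subdomain of it.
        host = urlparse(url).hostname or ""
        return any(host == d or host.endswith("." + d) for d in cls.domains)


@register
class Springer(Recipe):
    name = "Springer"
    domains = ("link.springer.com",)


@register
class Wiley(Recipe):
    name = "Wiley"
    domains = ("onlinelibrary.wiley.com",)


def find_recipe(url):
    """Return the first registered recipe that claims this URL, or None."""
    for recipe in RECIPES.values():
        if recipe.matches(url):
            return recipe
    return None
```

With a layout like this, supporting a new publisher would mean adding one more subclass, and a broken recipe could be repaired without touching the others.
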

Dependencies
------------
* Linux environment required
* [Calibre](https://calibre-ebook.com/) (to access `ebook-convert`)
* Firefox with headless support
* [Geckodriver](https://github.com/mozilla/geckodriver/releases) installed somewhere in `$PATH`
* [Pandoc](http://pandoc.org/)

Python packages (available with `pip`):
* [Selenium](http://selenium-python.readthedocs.io/)
* [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
* [pypandoc](https://github.com/bebraw/pypandoc)
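
Before running the tool, it can help to confirm that everything listed above is installed. The following is a rough sketch, not part of article-epub itself: `missing_dependencies` is a hypothetical helper, and the executable name `firefox` is an assumption (some distributions ship it under a different name, such as `firefox-esr`). The module names are the standard import names of the three Python packages.

```python
import importlib.util
import shutil

# Executables expected somewhere in $PATH (the name "firefox" is an assumption).
EXECUTABLES = ["ebook-convert", "firefox", "geckodriver", "pandoc"]
# Import names of the Python packages (BeautifulSoup is imported as "bs4").
MODULES = ["selenium", "bs4", "pypandoc"]


def missing_dependencies(executables=EXECUTABLES, modules=MODULES):
    """Return the names of executables and modules that cannot be found."""
    missing = [exe for exe in executables if shutil.which(exe) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```
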

Usage
-----
```
usage: article-epub [-h] [-u URL] [-d DOI] [-t TITLE] [-o FILE] [-p]

optional arguments:
  -h, --help  show this help message and exit
  -u URL      URL of article
  -d DOI      DOI of article
  -t TITLE    Title of article
  -o FILE     Name of output file
  -p          List supported publishers
```
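
For readers curious how an interface like this is typically declared, here is a minimal `argparse` sketch that accepts the same options. It is a reconstruction from the help text alone, not the tool's source: `build_parser` and the `dest` attribute names (`url`, `doi`, `title`, `output`, `publishers`) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser accepting the options shown in the help text above."""
    parser = argparse.ArgumentParser(prog="article-epub")
    parser.add_argument("-u", dest="url", metavar="URL", help="URL of article")
    parser.add_argument("-d", dest="doi", metavar="DOI", help="DOI of article")
    parser.add_argument("-t", dest="title", metavar="TITLE", help="Title of article")
    parser.add_argument("-o", dest="output", metavar="FILE", help="Name of output file")
    parser.add_argument("-p", dest="publishers", action="store_true",
                        help="List supported publishers")
    return parser


if __name__ == "__main__":
    # Parse a sample command line; the DOI is a placeholder, not a real article.
    args = build_parser().parse_args(["-d", "10.1000/xyz123", "-o", "paper.epub"])
    print(args.doi, args.output)
```

On the command line, the equivalent call would be `article-epub -d 10.1000/xyz123 -o paper.epub`, with the placeholder DOI replaced by that of a real article.
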