Santagate 2019 Pro for Workgroups
That is a great change from the past, when you had to have a university affiliation to get access to a paper, and sometimes even that was not enough.
'Oxford Scholarship Online' would license different sets of books to different departments; so someone from the philosophy department couldn't get access to books classified under sociology or history.
Imagine doing something similar at the checkout table in a 'physical' library.
Here's another video: https://www.youtube.com/watch?v=PriwCi6SzLo (including an interview with the great Alexandra Elbakyan).
Cory Doctorow recently wrote about this in some detail (incl. helpful links): https://pluralistic.net/2024/08/16/the-public-sphere/#not-the-elsevier
The name of the pdf file inside the torrent is its md5 hashsum, without a .pdf extension.
On libgen.rs you can see the md5 hashsum on the download page; on libgen.li you need to look at the JSON file provided at the link on the search result, as they don't render it in the UI.
The torrents are alive; as long as you can get the torrent links from libgen, you have access to the files. (No need to share whole archives either, you can pick & choose).
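For illustration, here's a rough Python sketch (the directory & hash are placeholders) for matching a file inside an extracted torrent payload against the md5 you got from the libgen page/JSON, and double-checking it by rehashing:

    import hashlib
    from pathlib import Path

    def md5sum(path, chunk_size=1 << 20):
        # Hash in chunks so large pdfs don't need to fit in memory.
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    wanted = "0123456789abcdef0123456789abcdef"   # md5 from the libgen page/JSON (placeholder)
    payload_dir = Path("libgen_torrent_payload")   # hypothetical local path

    # The payload files are named by their md5, so a name match finds the file;
    # rehashing it verifies the download.
    for p in payload_dir.iterdir():
        if p.name.lower() == wanted:
            print(p, "ok" if md5sum(p) == wanted else "corrupt")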
Wouldn't enabling the --system-site-packages flag during venv creation do exactly what the OP wants, provided that gunicorn is installed as a system package (e.g. with the distro's package manager)? https://docs.python.org/3/library/venv.html
Sharing packages between venvs would be a dirty trick indeed; though sharing with system-site-packages should be fine, AFAIK.
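For what it's worth, a minimal sketch of doing that from the stdlib venv module (the directory name is made up); it's equivalent to running python3 -m venv --system-site-packages appenv:

    import venv

    # The resulting venv falls through to the system site-packages for anything
    # it doesn't have itself -- e.g. a distro-installed gunicorn -- without
    # copying or linking packages between environments.
    venv.create("appenv", system_site_packages=True, with_pip=True)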
Michael W. Lucas's "Networking for System Administrators" is a great resource: https://mwl.io/nonfiction/networking#n4sa
That's not a consideration in favor of grouping h/j as the 'back' keys and k/l as the 'forward' keys, though. It's perfectly comfortable & intuitive to have the index finger on the key that goes forward.
Why, though? Why is it so obvious that j 'should have' been [edit: up]?
Sure, if you drag it through the garden.
PyMuPDF is excellent for extracting 'structured' text from a pdf page — though I believe 'pulling out relevant information' will still be a manual task, UNLESS the text you're working with allows parsing into meaningful units.
That's because 'textual' content in a pdf is nothing other than a bunch of instructions to draw glyphs inside a rect that represents a page; utilities that come with mupdf or poppler arrange those glyphs (not always perfectly) into 'blocks', 'lines', and 'words' based solely on whitespace separation; the programmer who uses those utilities in an end-user facing application then has to figure out how to create the illusion (so to speak) that the user is selecting/copying/searching for paragraphs, sentences, and so on, in proper reading order.
PyMuPDF comes with a rich collection of convenience functions to make all that less painful, like dehyphenation, eliminating superfluous whitespace, etc.; but you'll still need some further processing to pick out the humanly relevant info.
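A minimal sketch of what that looks like with PyMuPDF (the file name is made up); the 'blocks' extraction gives you roughly paragraph-sized chunks in reading order:

    import fitz  # PyMuPDF

    doc = fitz.open("paper.pdf")  # hypothetical file
    for page in doc:
        # Each block is (x0, y0, x1, y1, text, block_no, block_type);
        # block_type 0 is text, 1 is an image.
        for block in page.get_text("blocks"):
            if block[6] == 0:
                print(block[4].strip())
        # Plain per-page text is page.get_text("text"); flags like
        # fitz.TEXT_DEHYPHENATE can be OR'ed in to join hyphenated words.
    doc.close()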
Built-in regex capabilities of Python can suffice for that parsing; but if not, you might want to look into NLTK tools, which apply sophisticated methods to tokenize words & sentences.
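E.g. a tiny NLTK sketch (the sample text is made up; the tokenizer model needs a one-time download, and newer NLTK versions may want 'punkt_tab' instead of 'punkt'):

    import nltk

    nltk.download("punkt", quiet=True)  # one-time tokenizer model download

    page_text = "Dr. Smith et al. report a 3.5 mm shift. The effect persisted."
    for sentence in nltk.sent_tokenize(page_text):
        print(nltk.word_tokenize(sentence))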
EDIT: I really should've mentioned some proper full text search tools. Once you have a good plaintext representation of a pdf page, you might want to feed that representation into tools like the following to index them properly for relevant info:
https://lunr.readthedocs.io/en/latest/ -- this is easy to use & set up, esp. in a Python project.
... it's based on principles that are put to use in this full-scale, 'industrial strength' full text search engine: https://solr.apache.org/ -- it's a bit of a pain to set up, but Python can interface with it through any HTTP client. Once you set up some kind of mapping between search tokens/keywords/tags, the plaintext page, & the actual pdf, you can get from a phrase search, for example, to a bunch of vector graphics (i.e. the pdf) relatively painlessly.
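To make the lunr suggestion concrete, a small sketch with the lunr.py package (document ids & text are made up): each pdf page becomes one indexed document whose ref maps a hit back to the file & page.

    from lunr import lunr

    # One document per pdf page; "id" is the ref that points back to the pdf.
    docs = [
        {"id": "paper.pdf:3", "body": "plaintext extracted from page 3 ..."},
        {"id": "paper.pdf:4", "body": "plaintext extracted from page 4 ..."},
    ]
    idx = lunr(ref="id", fields=("body",), documents=docs)

    for hit in idx.search("extracted"):
        print(hit["ref"], hit["score"])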
Finally!