The horseshit and scams that surround these programs never cease to amaze me.
Someone showed me the online reader they had to use for their classes. Though i'm sure there are others that are equally shitty, this Brytewave digibook reader from Follett was unusable. It was extremely slow, and navigation was difficult because it tended to send the user to random pages instead of the ones specified. It was cram time at the end of the semester, and the book she was trying to study kept closing and jumping to random pages. That's how e-book technology has revolutionized education and brought us to a bright new era of virtual learning! Fuck the rent-seeking charlatans and the government money they rode in on.
That's the extent of the rant. How to get the fucking shitty book as a pdf:
As you can guess, i wrote a script to capture the document. Actually, i wrote two scripts. The first uses xdotool and scrot (a fast, script-friendly screenshot tool) to click through the document and capture all the image data. It helps to have a large monitor; otherwise, one can zoom in and take multiple shots per page.
#!/bin/bash
# click through drm'd digibooks in brytewave reader and copy them as screenshots

clickwait=1
pagewait=20    # seconds to wait for next page to load
startpage=1    # useful in case of crash (it happens)
endpage=400    # the last reader page
dumppath="/home/personface/pileofshittybooks"
docname="governmentbook"

xdotool search --screen 0 --name "BryteWave" windowfocus

page=$startpage
while [ $page -le $endpage ]; do
    scrot "$dumppath/$docname-$page.jpg"
    # use a screenshot to find button coordinates
    sleep $clickwait; xdotool mousemove 1907 630
    sleep 0.1; xdotool click 1
    sleep $pagewait
    page=$((page+1))
done
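The mousemove coordinates (1907 630) are where the reader's next-page button sits on my monitor; they will be different on any other setup. One quick way to find them, assuming xdotool is installed, is to park the cursor over the button and ask xdotool where it is:

# hover the mouse pointer over the next-page button, then run:
xdotool getmouselocation
# prints something like: x:1907 y:630 screen:0 window:62914583
# plug those x and y values into the mousemove line in the script

Taking a screenshot and reading the pixel coordinates off it in GIMP, like the script comment suggests, works just as well.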
The second script cuts the two pages out of each screenshot and compiles a minimal pdf from the images.
#!/bin/bash
# disassemble screenshots made by bryteclicker

docprefix="governmentbook"
pdfname="Fascist_Propaganda_by_CKSucker_10ed"
inext="jpg"
outext="jpg"
outqual=50
endonduplicate=1

page=1
lastsize=0
for infile in $(ls -1tr $docprefix*.$inext); do
    # check for consecutive duplicates since screengrabber cannot verify page loads
    # if flag is set, assume duplicates indicate screengrabber is stalled on last page of document
    # useful when screengrabber doesn't know exact document size and is set with excess pagecount
    thissize=$(stat -c %s "$infile")    # file size in bytes
    if [ "$lastsize" == "$thissize" ]; then
        echo "$infile may be a duplicate of the previous file!"
        if [ $endonduplicate == 1 ]; then
            echo "i'm going to assume this is the end of the document"
            break
        fi
    fi
    lastsize=$thissize

    # crop pages from screenshot
    # use GIMP to get coordinates
    convert -crop 670x1005+282+122 -quality $outqual "$infile" "output-$(printf "%04d" $page)-A.$outext"
    convert -crop 670x1005+968+122 -quality $outqual "$infile" "output-$(printf "%04d" $page)-B.$outext"

    page=$((page+1))
done

convert output*.$outext "$pdfname.pdf"
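The crop geometry (670x1005+282+122 and 670x1005+968+122) is likewise specific to my screen and zoom level. Before letting the loop grind through a few hundred screenshots, it's worth testing the offsets on a single capture. A quick sanity check, assuming the filenames produced by the first script:

# crop both pages out of the first screenshot and eyeball the results
convert -crop 670x1005+282+122 governmentbook-1.jpg test-A.jpg
convert -crop 670x1005+968+122 governmentbook-1.jpg test-B.jpg
display test-A.jpg test-B.jpg    # imagemagick's viewer; any image viewer will do

If the pages come out clipped or off-center, nudge the width/height and x/y offsets until they don't.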
Like i mentioned, one can get better image quality by making each page span more than one screenshot, but this complicates post-processing a tad. The quality settings in the second script can also be tweaked for better output. My method was a tradeoff between readability and file size reduction.
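If each page is captured as two zoomed-in screenshots, the extra post-processing is mostly just stitching the crops back together before building the pdf. A rough sketch with imagemagick, using hypothetical top-half and bottom-half crops (not the -A/-B left/right pages produced above):

# stack a top-half crop and a bottom-half crop of the same page into one tall image
convert page-0001-top.jpg page-0001-bottom.jpg -append page-0001-full.jpg

-append stacks images vertically; +append would join them side by side.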