
Wednesday, July 29, 2015

Read and write animated gifs with MATLAB

UPDATE: These scripts have been vastly improved. Find the current versions here.

In the process of developing my own image mangling toolbox for Matlab, I had routine need for generating animated gif files from image sequences.  It then follows that said .gif files might need to be read back again.  Matlab's inbuilt functions imread() and imwrite() don't make this trivial.  There are some suggestions floating around the internet, and there are some unanswered reports of unexpected behavior.  Not everything worked simply, but I came up with my own ways.

First, writing the image was fairly simple.  I had originally been using imagemagick to do the heavy lifting, but decided to go for a more direct route.  The imagemagick method does seem to have better output, but it's quite a bit slower in my experience.
function gifwrite(inarray,filepath,delay,method)
%   GIFWRITE(INARRAY, FILEPATH, {DELAY}, {METHOD})
%       write image stack to an animated gif
%       
%   INARRAY: 4-D image array (rgb, uint8)
%   FILEPATH: full name and path of output animation
%   DELAY: frame delay in seconds (default = 0.05)
%   METHOD: animation method, 'native' or 'imagemagick' (default = 'native')
%       'imagemagick' may have better quality, but is much slower

if nargin < 4
    method = 'native';
end

if nargin < 3
    delay = 0.05;
end

numframes = size(inarray,4);

if strcmpi(method,'native')
    disp('creating animation')
    for n = 1:numframes
        [imind,cm] = rgb2ind(inarray(:,:,:,n),256);
        if n == 1
            imwrite(imind,cm,filepath,'gif','DelayTime',delay,'LoopCount',inf);
        else
            imwrite(imind,cm,filepath,'gif','DelayTime',delay,'WriteMode','append');
        end
    end
else
    disp('creating frames')
    for n = 1:numframes
        imwrite(inarray(:,:,:,n),sprintf('/dev/shm/%03dgifwritetemp.png',n),'png');
    end

    disp('creating animation')
    % imagemagick wants the delay in centiseconds
    system(sprintf('convert -delay %d -loop 0 /dev/shm/*gifwritetemp.png %s',round(delay*100),filepath));

    disp('cleaning up')
    system('rm /dev/shm/*gifwritetemp.png');
end
return
Reading the image back wasn't straightforward at all.  The naive approach suggested by the documentation and posts online does not produce correct output.  imread(..., 'Frames', 'all') returns only a single colormap corresponding to the global color table in the file, so any file containing multiple images with local color tables will turn into a pile of garbage.
function outpict=gifreadcrap(filepath)
%   GIFREADCRAP(FILEPATH)
%       reads all frames of an animated gif into a 4-D RGB image array
%       seems imread() cannot correctly read animated gifs

[images, map]=imread(filepath, 'gif', 'Frames', 'all');

s=size(images);
numframes=s(4);

outpict=zeros([s(1:2) 3 numframes],'uint8');
for n = 1:numframes
    outpict(:,:,:,n)=ind2rgb8(images(:,:,:,n),map);
end

return
Original image written with gifwrite() and results as returned by gifreadcrap()
Using imread() to read single frames in the hope of getting correct color information produced exactly the same results. imfinfo() can be used to fetch file information, and it certainly returns color tables which should correspond to the LCTs in the file. This is where things get ugly.

I spent some time with a hex editor and the file standards documentation to verify what I suspected was a bug. While imfinfo() returned the LCT data, the tables were all shifted by exactly one byte. The immediately adjacent bytes were being read correctly, though. Sounds like an off-by-one error (OBOE) to me!
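To see why a single-byte shift is so destructive, consider how a GIF color table is laid out: a flat run of 3-byte RGB triplets, one after another. A small Python sketch (using a hypothetical 4-entry palette, not bytes from a real file):

```python
# A GIF color table is a flat run of 3-byte RGB triplets.
# Hypothetical 4-entry table for illustration:
table = bytes([255, 0, 0,     # entry 0: red
               0, 255, 0,     # entry 1: green
               0, 0, 255,     # entry 2: blue
               255, 255, 0])  # entry 3: yellow

def read_palette(raw, offset=0):
    """Slice the raw bytes into (R, G, B) triplets starting at offset."""
    raw = raw[offset:]
    return [tuple(raw[i:i+3]) for i in range(0, len(raw) - len(raw) % 3, 3)]

print(read_palette(table))     # correct: red, green, blue, yellow
print(read_palette(table, 1))  # off by one: every channel lands in the wrong slot
```

With a one-byte offset, each entry picks up the last channel of its neighbor, so every color in every frame that uses an LCT comes out wrong, which is exactly the kind of garbage shown above.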

At this point, I discover that yes, indeed it is a bug in versions R14SP3-2012a. A patch exists, but for the sake of anyone who doesn't care to patch their install, I decided to integrate both the native solution and my own imagemagick workaround.

Keep in mind that it may still be necessary to coalesce the animation before opening it, depending on what you want to do in Matlab. If you're trying to import an optimized gif, you can use the included option to coalesce the image. This optional mode is similar to the other optional modes in these two functions in that it requires external tools and assumes a linux environment. All temporary file operations utilize /dev/shm for marginal speed improvement. Altering this temporary path for your own environment should be trivial. Both functions should work in default modes on other systems, but I have no intention of testing that.
function outpict=gifread(filepath,method,coalesce)
%   GIFREAD(FILEPATH, {METHOD}, {COALESCE})
%       reads all frames of an animated gif into a 4-D RGB image array
%       
%   FILEPATH: full path and filename
%   METHOD: file read method, 'native' or 'imagemagick' (default = 'native')
%       'imagemagick' method is a workaround for bug 813126 present in
%       R14SP3-2012a versions.  Bug consists of an OBOE in reading LCT data.
%       A patch does exist for these versions:
%       https://www.mathworks.com/support/bugreports/813126
%   COALESCE: 0 or 1, Specifies whether to coalesce the image sequence prior to
%       importing.  Used when loading optimized gifs. Requires imagemagick.
%       (optional, default 0)
 
if nargin < 3
    coalesce=0;
end
if nargin < 2
    method='native';
end


if coalesce==1
    system(sprintf('convert %s -layers coalesce /dev/shm/gifreadcoalescetemp.gif',filepath));
    filepath='/dev/shm/gifreadcoalescetemp.gif';
end

if strcmpi(method,'native')
    % use imread() directly (requires patched imgifinfo.m)
    [images, map]=imread(filepath, 'gif', 'Frames', 'all');
    infostruct=imfinfo(filepath);

    s=size(images);
    numframes=s(4);

    outpict=zeros([s(1:2) 3 numframes],'uint8');
    for n = 1:numframes
        LCT=infostruct(1,n).ColorTable;
        outpict(:,:,:,n)=ind2rgb8(images(:,:,:,n),LCT);
    end
else
    % split the gif using imagemagick instead
    system(sprintf('convert %s /dev/shm/%%03d_gifreadtemp.gif',filepath));
    [~,numframes]=system('ls -1 /dev/shm/*gifreadtemp.gif | wc -l');

    numframes=str2double(numframes);
    [image, map]=imread('/dev/shm/000_gifreadtemp.gif', 'gif');
    s=size(image);

    outpict=zeros([s(1:2) 3 numframes],'uint8');
    for n = 1:numframes
        [image, map]=imread(sprintf('/dev/shm/%03d_gifreadtemp.gif',n-1), 'gif');
        outpict(:,:,:,n)=ind2rgb8(image,map);
    end

    system('rm /dev/shm/*gifreadtemp.gif');
end

if coalesce==1
    system(sprintf('rm %s',filepath));
end

return
While all of this works, the native reading method does require either a patched copy of imgifinfo.m or Matlab version 2012b or later. Of course, once the existence of the bug was known, fixing things was simple. All the hours over the last day and a half that I spent digging for answers online and with a hex editor only rediscovered something that was already known but hidden behind MathWorks' member login. It pisses me off enough that even finding the solution doesn't settle me down. What possible purpose does restricting access to bug reports serve?

If there's one thing I should have learned from my experiences with asking things of forums, it's to never ask forums. The internet is littered with the evidence of my inability to learn this simple lesson. This very blog was an angry reaction to the previous spectacularly infuriating experience. If I'm bound to ask questions of a silent screen -- if I'm bound to carve out my own conclusions and place them complete on someone else's mantle in the meager hope that I can help the next person avoid my fate -- then I might as well do it in a squalor of my own crafting, without the unrealistic expectations of interaction tugging at my attention.

Tuesday, June 16, 2015

Digitize all those binders full of notes

Ever find yourself referring to your old notes and school work?  Why not cram that giant stack of sketchy, error-riddled paper into a pdf or nine?  Just think of all the advantages!
  • they would be more portable - you could put them on a usb stick or a phone
  • they would be easier to reference while doing design work on the computer
  • they would take up less space and collect fewer dead bugs 
  • they might not get damaged by the leaky roof or eaten by termites
  • they could be shared with people who want to learn that they can't read your shitty handwriting
Maybe you could even have the prescience to approach this task before you're two years past graduation!

Just some books and binders

Over my many years wasting my life with a bad drawing habit, I've learned one thing from flatbed scanners: graphite pencil marks on paper are reflective.  This means that for particular illumination angles, scanned or photographed pencil handwriting/drawings may be washed out very effectively.  Long ago, I simply abandoned scanners for drawing purposes because of this.  I decided to simply use my second-hand smartphone as a camera.  I grabbed some scrap metal from the pile next to the bandsaw, and after some sawing and hammering and hot glue, I created a tripod mount for the phone.  I set up a light tent and made a staging easel out of a pizza box.  After ten minutes of page flipping, I have a hundred pictures of upside-down pages that I can hardly read with my D-grade eyeballs. 

Rectification and contrast enhancement

Now, how to process the images?  For most purposes, processing of scanned books or handwriting can be done with a simple procedure:
  • desaturate original image
  • make a copy
  • blur the copy with a large kernel
  • blend the blurred mask with the original in a Divide mode
  • re-adjust levels as appropriate
This works well and produces a high-contrast image.  Keep in mind though why it works.  This is essentially a high-pass filter method.  Only the low spatial frequency content survives the blurring operation and is removed in the blend operation.  This removes the effects of uneven lighting or slight paper contours, but if the page content is not restricted to narrow lines, we'll run into problems. 
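The steps above can be sketched in miniature. Here's a hedged, dependency-free Python toy working on a 1-D "scanline" rather than a real 2-D page, with a box blur standing in for the large Gaussian kernel; the window size and the fake data are illustrative assumptions, not the actual pipeline:

```python
# Minimal 1-D sketch of the divide-blend contrast trick: divide a grayscale
# scanline by a heavily blurred copy of itself to cancel slow illumination
# changes.  A real page is 2-D and uses a large Gaussian kernel.
def box_blur(signal, radius):
    """Crude box blur: mean over a window of up to 2*radius+1 samples."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def divide_blend(signal, radius=8):
    """ImageMagick-style Divide: original / blurred, rescaled to 0..255."""
    blurred = box_blur(signal, radius)
    return [min(255, round(255 * s / max(b, 1e-6))) for s, b in zip(signal, blurred)]

# A bright-to-dark lighting gradient with two dark "pencil strokes" on it:
scanline = [200 - i for i in range(64)]
scanline[10] = scanline[40] = 30
flat = divide_blend(scanline)
# The gradient flattens toward white while the strokes stay dark.
```

The slow gradient divides out to roughly 1.0 (white after rescaling), while anything much darker than its neighborhood survives, which is exactly why broad dark regions get hollowed out, as described next.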

Excess filtering on pages with graphics

Let's say some of the pages have printed figures or tables; the removal of low-frequency content will tend to leave only the edges of any solid dark regions.  In the binders that contained occasional printed graphics, I used a different method for processing printed pages.  Since most of my handwritten notes are on yellow paper, I simply processed non-yellow pages differently.  If I know there are no graphics, I can just ignore the testing routine.

The color-testing routine finds the average color of an annulus of the page so as to ignore content and page placement inaccuracy.  One convenience of this is that images that are processed with the high-pass filter method can be re-colorized if desired.  I personally don't find this to help with visual contrast, so I didn't use it.
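For illustration, the annulus average and the yellow test can be sketched in Python. This is a loose reimplementation of the idea, not the ImageMagick pipeline in the script below; the 1.3-times-blue threshold comes from the script's bc test, but the ring margins here are assumptions:

```python
# Average the pixels in a border ring, ignoring both the page content in
# the middle and the very edge of the crop, then apply the yellow test.
def annulus_mean(pixels, w, h, outer=0.05, inner=0.20):
    """pixels: flat row-major list of (r, g, b); keep only the ring
    between the outer and inner fractional margins."""
    acc = [0, 0, 0]
    count = 0
    for y in range(h):
        for x in range(w):
            fx, fy = x / w, y / h
            edge = min(fx, fy, 1 - fx, 1 - fy)  # distance to nearest border
            if outer <= edge < inner:           # inside the ring only
                p = pixels[y * w + x]
                for c in range(3):
                    acc[c] += p[c]
                count += 1
    return [a / count for a in acc]

def is_yellow(rgb):
    """Same test as the script: (R+G)/2 > 1.3*B."""
    r, g, b = rgb
    return (r + g) / 2 > 1.3 * b

# A uniform yellow "page", 20 px square:
page = [(230, 220, 120)] * 400
print(is_yellow(annulus_mean(page, 20, 20)))  # a yellow page passes the test
```

With a uniform yellow page the ring average is just the paper color and the test fires; white paper, with blue roughly equal to red and green, fails it.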

#!/bin/bash
# process photos of coursework pages from binders
# uses slower method of contrast mask generation and overlay 

#581 1773x2283+45+543 180
#487 1758x2286+54+546 180
#488 2220x1716+321+36 270
#221 1785x2325+51+531 180
#GDn 1755x2394+24+657 180
#471 1803x2319+33+540 180
#537 1779x2286+45+552 180

pdfname="ECE537_Integrated_Photonics"
cropsize="1779x2286+45+552" #the rect parameters for cropping
rotateamt="180"    #how much to rotate after cropping

indir="originals" #this is the local directory name where the original images are
outdir="output"   #this is the local directory name where the script will shit out the processed pages
outqual=85   #percent jpeg quality
hconly=1   #always assume pages are handwritten (used when there aren't any printed graphics pages)
retint=0   #percent retint for yellow pages
retintmode="multiply"

# ###########################################################################
if [ $hconly == 1 ]; then 
 echo "high contrast mode"
else
 echo "auto contrast mode"
fi

page=1
for infile in $(ls -1 $indir/*.jpg); do
 outfile="$outdir/output-$(printf "%04d" $page).jpg"
 jpegtran -crop $cropsize -trim -copy none $infile | \
 jpegtran -rotate $rotateamt -trim -outfile temp.jpg
 
 if [ $hconly == 0 ]; then 
  # get average page color excluding border and content
  imgstring=$(convert \( temp.jpg -threshold -1 -scale 95% \) \
    \( temp.jpg -threshold 100% -scale 80% \) \
   -gravity center -compose multiply -composite - | \
   convert temp.jpg - -alpha off -gravity center -compose copy_opacity -composite -resize 1x1 txt:)
  RGB=$(echo $imgstring | sed 's/ //g' | sed 's/(/ /g' | sed 's/)/ /g' | sed 's/,/ /g' | cut -d ' ' -f 6-8)
  R=$(echo $RGB | cut -d ' ' -f 1)
  G=$(echo $RGB | cut -d ' ' -f 2)
  B=$(echo $RGB | cut -d ' ' -f 3)
  isyel=$(echo "($R+$G)/2 > $B*1.3" | bc)
  #echo $imgstring
  echo $R $G $B ">> $page is yellow? >>" $isyel
 fi

 if [ $hconly == 1 ] || [ $isyel == 1 ]; then
  # if page is yellow, do 100% contrast enhancement and partial page re-tinting 
  if [ $retint != 0 ]; then 
   convert -modulate 100,0 temp.jpg - | \
   convert - \( +clone -filter Gaussian -resize 25% -define filter:sigma=25 -resize 400% \) -compose Divide_Src -composite - | \
   convert -level 70%,100% -quality 100 - temp.jpg
   convert temp.jpg \( +clone -background "rgb($R,$G,$B)" -compose Dst -flatten \) -compose $retintmode -composite - | \
   convert temp.jpg - -compose blend -define compose:args=$retint -composite - | \
   convert -quality $outqual - $outfile
  else
   convert -modulate 100,0 temp.jpg - | \
   convert - \( +clone -filter Gaussian -resize 25% -define filter:sigma=25 -resize 400% \) -compose Divide_Src -composite - | \
   convert -level 70%,100% -quality $outqual - $outfile  
  fi
 else
  # if page is not yellow, retain most color and do a 50% contrast enhancement
  convert -modulate 100,80 temp.jpg - | \
  convert - \( +clone -filter Gaussian -resize 25% -define filter:sigma=25 -resize 400% \) -compose Divide_Src -composite - | \
  convert - temp.jpg -compose blend -define compose:args=50 -composite - | \
  convert -level 25%,100% -quality $outqual - $outfile
 fi

 #echo $infile
 page=$((page+1))
done

rm temp.jpg
convert $outdir/*.jpg $pdfname.pdf


The blurring method entails a size reduction and expansion.  This has two purposes: First, it speeds up blurs with a large kernel (in this case, by about a factor of 3); second, it helps reduce vignetting effects that would otherwise be caused by a simple "-blur 0,100" operation.  If a simple blur is used, it would help to crop the page oversize, then trim it down after contrast enhancement or after the blur itself.

-blur 0,100
-filter Gaussian -resize 25% -define filter:sigma=25 -resize 400%
Difference between simple blur and resize+blur methods
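The shrink-blur-expand trick can be sketched with a dependency-free Python toy. This is a rough 1-D analogy, with box filters standing in for ImageMagick's Gaussian; the kernel sizes are illustrative assumptions, not what convert does internally:

```python
# Instead of one huge blur at full resolution, shrink, blur with a
# proportionally smaller kernel, and expand again.
def blur1d(sig, radius):
    """Plain box blur: mean over a window of up to 2*radius+1 samples."""
    out = []
    for i in range(len(sig)):
        lo, hi = max(0, i - radius), min(len(sig), i + radius + 1)
        out.append(sum(sig[lo:hi]) / (hi - lo))
    return out

def downsample(sig, factor):
    """Average each run of `factor` samples (the -resize 25% step)."""
    return [sum(sig[i:i + factor]) / factor
            for i in range(0, len(sig) - factor + 1, factor)]

def upsample(sig, factor):
    """Nearest-neighbor expansion back to full size (the -resize 400% step)."""
    return [v for v in sig for _ in range(factor)]

signal = list(range(160))               # a smooth ramp, like gentle page shading
wide = blur1d(signal, 40)               # expensive blur at full resolution
fast = upsample(blur1d(downsample(signal, 4), 10), 4)  # same reach, 1/4 the samples
# Away from the edges the two agree closely, but `fast` touches far fewer samples.
```

Because the blur only needs to kill low-frequency content anyway, the detail lost in the downsample doesn't matter, and the averaging in the resize steps contributes part of the smoothing for free.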

Of course you can guess that I'd do this in bash.  This is about as ad-hoc as they come; it's not even externally parameterized.  I was thinking about doing this in Matlab, but I decided I'd rather pull my hair out trying to do image stacks in ImageMagick.  Tell it where the files are and how they should be checked and compressed, and the script will grind them into a pdf for you.  I highly doubt anyone would ever actually use this ugly code, but I'm posting it anyway because I have nothing else to do. Still, I'm not dumb enough to post my horrible notes in full.

Wednesday, September 10, 2014

Extract a document from an online reader

I'm glad I'm not going through first-year undergrad general studies again.
The horseshit and scams that surround these programs never cease to amaze me.

Someone showed me the online reader they had to use for their classes.  Though I'm sure there are others that are equally shitty, this Brytewave digibook reader from Follett was unusable.  It was extremely slow, and navigation was made difficult because it would tend to direct the user to random pages instead of the ones specified.  It was cram time at the end of the semester, and the book she was trying to study kept closing and jumping to random pages.  That's how e-book technology has revolutionized education and brought us to a bright new era of virtual learning!  Fuck the rent-seeking charlatans and the government money they rode in on.

That's the extent of the rant.  How to get the fucking shitty book as a pdf:
As you can guess, I wrote a script to capture the document.  Actually, I wrote two scripts.  The first script uses xdotool and scrot (a fast and script-friendly screenshot tool) to click through the document and capture all the image data.  It helps to have a large monitor; otherwise, one can zoom in and take multiple shots per page.
#!/bin/bash
# click through drm'd digibooks in brytewave reader and copy them as screenshots

clickwait=1    # seconds to pause after the screenshot before clicking
pagewait=20    # seconds to wait for next page to load
startpage=1    # useful in case of crash (it happens)
endpage=400    # the last reader page
dumppath="/home/personface/pileofshittybooks"
docname="governmentbook"

xdotool search --screen 0 --name "BryteWave" windowfocus; 
page=$startpage
while [ $page -le $endpage ]; do
    scrot "$dumppath/$docname-$page.jpg"

    # use a screenshot to find button coordinates    
    sleep $clickwait; xdotool mousemove 1907 630
    sleep 0.1; xdotool click 1
    sleep $pagewait
    
    page=$((page+1))
done

The second script cuts the two pages out of each screenshot and compiles a minimal pdf from the images.
#!/bin/bash
# disassemble screenshots made of bryteclicker

docprefix="governmentbook"
pdfname="Fascist_Propaganda_by_CKSucker_10ed"

inext="jpg"
outext="jpg"
outqual=50 
endonduplicate=1

page=1
lastsize=0
for infile in $(ls -1tr $docprefix*.$inext); do
    # check for consecutive duplicates since screengrabber cannot verify page loads
    # if flag is set, assume duplicates indicate screengrabber is stalled on last page of document
    # useful when screengrabber doesn't know exact document size and is set with excess pagecount
    thissize=$(stat -c %s $infile)
    if [ $lastsize == $thissize ]; then
        echo "$infile may be a duplicate of the previous file!"
        if [ $endonduplicate == 1 ]; then
            echo "i'm going to assume this is the end of the document"
            break;
        fi
    fi
    lastsize=$thissize

    # crop pages from screenshot
    # use GIMP to get coordinates
    convert -crop 670x1005+282+122 -quality $outqual $infile "output-$(printf "%04d" $page)-A.$outext"
    convert -crop 670x1005+968+122 -quality $outqual $infile "output-$(printf "%04d" $page)-B.$outext"
    page=$((page+1))
done

convert output*.$outext $pdfname.pdf

Like I mentioned, one can get better image quality by making each page span more than one screenshot, but this complicates post-processing a tad.  The quality settings in the second script can also be tweaked for better output.  My settings were a tradeoff between readability and file size reduction.