Friday, November 21, 2014

Matlab: Encode images in audio spectrum

UPDATE: This sandbox script has evolved into the functions im2spectrogram() and text2spectrogram() in my aptly named Matlab Image Mangling Toolbox.

A while back, I got bored and was looking for various ways to shove pictures where they didn't belong.  Among simpler ideas like concatenation, arbitrary character encoding schemes, and spreadsheet conversion, I tried my hand at conversion to audio.  Using this particular STFT/ISTFT set of tools, as well as basic parts of the image processing toolbox, I threw together this kludge:
%% simplified automatic spectrogram obfuscation

clc; clear all;
format compact;

% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

projdir='/home/assbutt/projects/imagepooper/';
inpict=imread([projdir 'test.jpg'], 'jpeg');
outwavname=[projdir 'soundpicture.wav']; 

% typical image adjustments
invert=0;           % invert if 1
flip=0;             % flip horizontal if 1
bluramt=1;          % gaussian blur amount, passed to fspecial as sigma (approx 0.5-10) (zero for no blur)
blurrad=3;          % gaussian kernel size in pixels, passed to fspecial as hsize (approx 2-5)

alteraspect=0.80;   % correct for viewer distortion
padbar=0.08;        % relative height of top padding (H=1+padbar)
volume=1;           % adjust signal volume (will clip beyond unity)

% add blur to reduce bright edge artifacts
% use nonzero padbar to keep image below mp3 cutoff (0 for none)

% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% desaturate color images (choose method if desired)
if (ndims(inpict) == 3); inpict=mrgb2gray(inpict,'default'); end

% invert where requested
if (invert == 1); inpict=255-inpict; end

% flip where requested
if (flip == 1); inpict=fliplr(inpict); end

inpict=flipud(inpict);      % flip image (low-f on bottom)
inpict=imadjust(inpict);    % set image contrast

magicnum=660;       
inaspect=length(inpict(1,:))/length(inpict(:,1));
inpict=padarray(inpict,[round(padbar*length(inpict(:,1))) 0],'post'); 
inpict=imresize(inpict,magicnum*[1 alteraspect*inaspect/(1+padbar)]);

if (bluramt > 0);
    h=fspecial('gaussian', blurrad*[1 1], bluramt);
    inpict=imfilter(inpict,h);
end

nft=length(inpict(:,1))*2-2;
h=nft; 
samplefreq=44100;
[x,t]=istft(inpict,h,nft,samplefreq);
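% store the result in S so the output does not shadow the stft() function
[S, f, t_stft]=stft(x, nft, h, nft, samplefreq);

figure(1)
subplot(1,2,1); plot(t, x);
subplot(1,2,2); imshow(flipud(real(S))); % flipud replaces the deprecated flipdim(...,1)
cmap=colormap('gray');
colormap(flipud(cmap))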

xp=x/(max(abs(x))*1.0001)*volume;
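% wavwrite() was removed from newer MATLAB releases; the equivalent call there is
% audiowrite(outwavname, xp, samplefreq, 'BitsPerSample', 32)
wavwrite(xp, samplefreq, 32, outwavname)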
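
The core trick is easier to see without the third-party functions. What follows is a stripped-down stand-in, not the istft() call used above: it assumes no window and no overlap (hop equal to transform size, as with h=nft), so each image column is simply mirrored into a conjugate-symmetric spectrum and inverse-transformed on its own. The toy image and the names img, full, frames, and recovered are made up for illustration.

```matlab
% Toy "image": rows are frequency bins (row 1 = DC), columns are time frames
img = zeros(9, 4);
img(3, 2) = 1;                      % one bright pixel

nbins = size(img, 1);
nft = 2*nbins - 2;                  % same relation the script uses for nft

% Mirror the interior rows so every column is a conjugate-symmetric spectrum,
% which is what makes the inverse transform come out real
full = [img; conj(img(end-1:-1:2, :))];

% One inverse FFT per column; the frames are laid end to end (no overlap)
frames = real(ifft(full, nft, 1));
x = frames(:);

% Sanity check: transform the audio back and look at the magnitudes
S = fft(reshape(x, nft, []), nft, 1);
recovered = abs(S(1:nbins, :));
```

With these assumptions, recovered matches img to within rounding error, and x holds one nft-sample frame per image column.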
Of course, when configuring the script parameters, one kind of has to guess at the transform size a person might use to view the result. The frequency-time scaling of a spectrogram is arbitrary, so it's basically impossible to enforce any first-view aspect ratio. Most photographs sound pretty terrible, though feeding it high-contrast images with little white content produces nicer audio sweeps.
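
To put rough numbers on that, here is a back-of-envelope sketch. It assumes each image column turns into exactly one hop of h samples, and that the viewer uses a plain non-overlapping transform of size nview and draws one pixel per bin. Real viewers overlap and rescale, so treat this as illustrative only. The width ncols and the helper viewaspect are hypothetical.

```matlab
samplefreq = 44100;
nrows = 660;                    % magicnum in the script
ncols = 900;                    % hypothetical resized image width
nft = 2*nrows - 2;              % transform size the script ends up using
h = nft;                        % hop size (no overlap)

binwidth = samplefreq/nft;      % Hz covered by one image row, about 33.5
duration = ncols*h/samplefreq;  % seconds of audio, about 27

% Aspect ratio (columns/rows) seen in a hypothetical viewer that uses
% transform size nview with no overlap
viewaspect = @(nview) (ncols*h/nview) / (nview/2 + 1);

encoded = viewaspect(nft);      % same as ncols/nrows
wide = viewaspect(1024);        % smaller transform: image stretched wide
tall = viewaspect(2048);        % larger transform: image squashed narrow
```

Under these assumptions the picture only comes out at its encoded proportions when the viewer happens to use the same transform size, which is why alteraspect is a guess.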

We take this image
And the script poops out an audio file with a spectrogram like this (using foobar2000)
View your output with foobar2000, audacity, or baudline.  The behavior could probably be improved, but this script is just a novelty.  Nobody cares. I don't care anymore either.

At the time I made this, I also pooped out a version that encodes text strings as a marquee in the top end of the frequency spectrum.  The output is a sound file containing the text.  Simply mix one or more of these files with some music to obtain a song full of hidden inaudible text. I probably should've just made the core of these scripts modular, but I wasn't sure what parts would be common.
%% create text marquee for spectrogram

clc; clear all;
format compact;

% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

instring='put a whole shitload of text here';

projdir='/home/assbutt/projects/imagepooper/';
outwavname=[projdir 'soundpicture.wav']; 

% typical image adjustments
alteraspect=0.80;   % correct for viewer distortion
textheight=0.03;    % relative height of text
textlocation=19000; % frequency center of text
volume=0.1;         % adjust signal volume (will clip beyond unity)

bluramt=5;          % gaussian blur amount, passed to fspecial as sigma (approx 0.5-10) (zero for no blur)
blurrad=3;          % gaussian kernel size in pixels, passed to fspecial as hsize (approx 2-5)

% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

magicnum=660;       
samplefreq=44100;

inpict=uint8(text2im(instring));
inpict=255*(1-inpict);

toplim=samplefreq*(1-textheight/2)/2;
botlim=samplefreq/2-toplim;
if (textlocation > toplim)
    textlocation=toplim; disp('supramaximal text center');
end
if (textlocation < botlim)
    textlocation=botlim; disp('subminimal text center');
end

inaspect=length(inpict(1,:))/length(inpict(:,1));
inpict=imresize(inpict,magicnum*textheight*[1 inaspect]);
botpad=floor(magicnum*((2*textlocation/samplefreq) - textheight/2));
toppad=ceil(magicnum*(1-textheight)-botpad);

inpict=padarray(inpict,[0 10],'both'); % pad ends
inpict=padarray(inpict,[toppad 0],'pre'); % top pad
inpict=padarray(inpict,[botpad 0],'post'); % bottom pad
inaspect=length(inpict(1,:))/length(inpict(:,1)); % recalculate
inpict=imresize(inpict,magicnum*[1 alteraspect*inaspect]);
inpict=flipud(inpict);      % flip image (low-f on bottom)

if (bluramt > 0);
    h=fspecial('gaussian', blurrad*[1 1], bluramt);
    inpict=imfilter(inpict,h);
end

nft=length(inpict(:,1))*2-2;
h=nft; 
[x,t]=istft(inpict,h,nft,samplefreq);
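% store the result in S so the output does not shadow the stft() function
[S, f, t_stft]=stft(x, nft, h, nft, samplefreq);

figure(1)
subplot(1,2,1); plot(t, x);
subplot(1,2,2); imshow(flipud(real(S))); % flipud replaces the deprecated flipdim(...,1)
cmap=colormap('gray');
colormap(flipud(cmap))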

xp=x/(max(abs(x))*1.0001)*volume;
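% wavwrite() was removed from newer MATLAB releases; the equivalent call there is
% audiowrite(outwavname, xp, samplefreq, 'BitsPerSample', 32)
wavwrite(xp, samplefreq, 32, outwavname)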
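
The row arithmetic that places the text band can be checked by hand. This sketch reruns just that part of the script with its default parameters. The clamp is written with min/max instead of the two if blocks, and bandlo/bandhi are extra names added here to show the resulting frequency band.

```matlab
samplefreq = 44100;
magicnum = 660;
textheight = 0.03;      % text band as a fraction of the full spectrum height
textlocation = 19000;   % requested center frequency in Hz

% Keep the band inside 0..Nyquist: the center can come no closer to either
% edge than half the band height
toplim = samplefreq*(1 - textheight/2)/2;
botlim = samplefreq/2 - toplim;
textlocation = min(max(textlocation, botlim), toplim);

% Rows of blank padding below and above the text, out of magicnum rows total
botpad = floor(magicnum*((2*textlocation/samplefreq) - textheight/2));
toppad = ceil(magicnum*(1 - textheight) - botpad);

% Frequency band the text should occupy (half the band is
% textheight*Nyquist/2 = textheight*samplefreq/4)
bandlo = textlocation - textheight*samplefreq/4;
bandhi = textlocation + textheight*samplefreq/4;
```

With the defaults this gives 558 rows of padding below the text and 83 above, and a band of roughly 18.7 to 19.3 kHz.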

Multiple rows of text (one is flipped) mixed into a song
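
Mixing is just addition. A minimal sketch, using synthetic tones as stand-ins for the song and the marquee so that it runs on its own; in practice both would be loaded from files (for example with audioread) and the result written back out. All variable names here are made up for illustration.

```matlab
fs = 44100;
t = (0:fs-1)'/fs;

song = 0.5*sin(2*pi*440*t);                  % stand-in for the music (1 s)
marquee = 0.1*sin(2*pi*19000*t(1:fs/2));     % stand-in for the text track (0.5 s)

% Zero-pad the shorter signal so both have the same length
n = max(numel(song), numel(marquee));
song(end+1:n) = 0;
marquee(end+1:n) = 0;

% Sum, then scale down only if the sum would clip
mix = song + marquee;
mix = mix / max(1, max(abs(mix)));
```

Keeping the marquee quiet relative to the music (the script defaults to volume=0.1) leaves headroom so the sum rarely needs rescaling.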
Of course, I only noticed after the fact that I had reinvented the wheel.
