UPDATE: This sandbox script has evolved into the functions im2spectrogram() and text2spectrogram() in my aptly named Matlab Image Mangling Toolbox.
A while back, I got bored and was looking for various ways to shove pictures where they didn't belong. Among simpler ideas like concatenation, arbitrary character encoding schemes, and spreadsheet conversion, I tried my hand at conversion to audio. Using this particular STFT/ISTFT set of tools, as well as basic parts of the Image Processing Toolbox, I threw together this kludge:
%% simplified automatic spectrogram obfuscation
clc; clear all;
format compact;
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
projdir='/home/assbutt/projects/imagepooper/';
inpict=imread([projdir 'test.jpg'], 'jpeg');
outwavname=[projdir 'soundpicture.wav'];
% typical image adjustments
invert=0; % invert if 1
flip=0; % flip horizontal if 1
bluramt=1; % gaussian blur amount (approx 0.5-10) (zero for no blur)
blurrad=3; % gaussian blur radius (approx 2-5)
alteraspect=0.80; % correct for viewer distortion
padbar=0.08; % relative height of top padding (H=1+padbar)
volume=1; % adjust signal volume (will clip beyond unity)
% add blur to reduce bright edge artifacts
% use nonzero padbar to keep image below mp3 cutoff (0 for none)
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% desaturate color images (choose method if desired)
if (ndims(inpict) == 3); inpict=mrgb2gray(inpict,'default'); end
% invert where requested
if (invert == 1); inpict=255-inpict; end
% flip where requested
if (flip == 1); inpict=fliplr(inpict); end
inpict=flipud(inpict); % flip image (low-f on bottom)
inpict=imadjust(inpict); % set image contrast
magicnum=660;
inaspect=length(inpict(1,:))/length(inpict(:,1));
inpict=padarray(inpict,[round(padbar*length(inpict(:,1))) 0],'post');
inpict=imresize(inpict,magicnum*[1 alteraspect*inaspect/(1+padbar)]);
if (bluramt > 0);
h=fspecial('gaussian', blurrad*[1 1], bluramt);
inpict=imfilter(inpict,h);
end
nft=length(inpict(:,1))*2-2;
h=nft;
samplefreq=44100;
[x,t]=istft(inpict,h,nft,samplefreq);
[S, f, t_stft]=stft(x, nft, h, nft, samplefreq); % named S so the stft() function is not shadowed
figure(1)
subplot(1,2,1); plot(t, x);
subplot(1,2,2); imshow(flipud(real(S)));
cmap=colormap('gray');
colormap(flipud(cmap))
xp=x/(max(abs(x))*1.0001)*volume;
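% wavwrite() was removed from newer MATLAB releases; there, use
% audiowrite(outwavname, xp, samplefreq, 'BitsPerSample', 32) instead
wavwrite(xp, samplefreq, 32, outwavname)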
Of course, when configuring the script parameters, one kind of has to guess at the transform size a person might use to view the result; otherwise, the arbitrariness of frequency-time scaling makes it basically impossible to enforce any first-view aspect ratio. Most photographs sound pretty terrible, though feeding it high-contrast images with little white content produces nicer audio sweeps.
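To put rough numbers on that scaling, the sketch below works out how many Hz one image row covers and how long one image column lasts with the script's settings. The 4:3 source image is a made-up example, and the duration is only approximate, since it assumes the ISTFT emits exactly one hop of samples per column.

```matlab
% Sketch: how the script's parameters map image pixels to time and frequency.
% Values mirror the script above; nothing here needs a toolbox.
magicnum   = 660;              % image rows after resizing (frequency bins)
samplefreq = 44100;            % Hz
nft        = magicnum*2 - 2;   % transform size used by the script
hop        = nft;              % hop equals the transform size (no overlap)

binwidth = samplefreq/nft;     % Hz covered by one image row
coltime  = hop/samplefreq;     % seconds covered by one image column

% hypothetical example: a 4:3 source image with the default adjustments
alteraspect = 0.80; padbar = 0.08; inaspect = 4/3;
ncols    = round(magicnum*alteraspect*inaspect/(1+padbar));
duration = ncols*coltime;      % approximate length of the audio in seconds

fprintf('%.2f Hz per row, %.1f ms per column, %d columns, %.1f s total\n', ...
    binwidth, 1000*coltime, ncols, duration);
```

A viewer that uses a different transform size or overlap will stretch the picture differently, which is why alteraspect can only ever be a guess.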
We take this image

And the script poops out an audio file with a spectrogram like this (using foobar2000)
View your output with foobar2000, Audacity, or baudline. The behavior could probably be improved, but this script is just a novelty. Nobody cares. I don't care anymore either.
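For anyone curious what the ISTFT step boils down to, here is a minimal sketch using only ifft(). It assumes one rectangular, non-overlapping frame per image column, which is a simplification and not how the STFT/ISTFT functions used above are implemented. The tiny test image is hypothetical.

```matlab
% Sketch only: turn a grayscale image into audio with plain ifft(), one
% frame per image column. No window and no overlap, so this is NOT the
% same as the STFT/ISTFT functions the script calls.
img = zeros(33, 8);          % hypothetical test image: 33 rows (bins), 8 columns (frames)
img(5, :) = 1;               % one bright horizontal line; row 1 is the lowest frequency

nft  = 2*size(img,1) - 2;    % frame length implied by the number of rows
half = double(img);          % rows 1..nft/2+1 are treated as bins 0..nft/2
full = [half; conj(half(end-1:-1:2, :))];   % mirror into a conjugate-symmetric spectrum
frames = real(ifft(full));   % ifft runs down each column; result is real up to roundoff
x = frames(:);               % butt the frames end to end
x = x/(max(abs(x))*1.0001);  % normalize just below full scale, as the script does
```

A single horizontal line comes out as a steady tone, which is the whole trick: bright pixels higher up in the image become energy at higher frequencies.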
At the time I made this, I also pooped out a version that encodes text strings as a marquee in the top end of the frequency spectrum. The output is a sound file containing the text. Simply mix one or more of these files with some music to obtain a song full of hidden inaudible text. I probably should've just made the core of these scripts modular, but I wasn't sure what parts would be common.
%% create text marquee for spectrogram
clc; clear all;
format compact;
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
instring='put a whole shitload of text here';
projdir='/home/assbutt/projects/imagepooper/';
outwavname=[projdir 'soundpicture.wav'];
% typical image adjustments
alteraspect=0.80; % correct for viewer distortion
textheight=0.03; % relative height of text
textlocation=19000; % frequency center of text
volume=0.1; % adjust signal volume (will clip beyond unity)
bluramt=5; % gaussian blur amount (approx 0.5-10) (zero for no blur)
blurrad=3; % gaussian blur radius (approx 2-5)
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
magicnum=660;
samplefreq=44100;
inpict=uint8(text2im(instring));
inpict=255*(1-inpict);
toplim=samplefreq*(1-textheight/2)/2;
botlim=samplefreq/2-toplim;
if (textlocation > toplim)
textlocation=toplim; disp('supramaximal text center');
end
if (textlocation < botlim)
textlocation=botlim; disp('subminimal text center');
end
inaspect=length(inpict(1,:))/length(inpict(:,1));
inpict=imresize(inpict,magicnum*textheight*[1 inaspect]);
botpad=floor(magicnum*((2*textlocation/samplefreq) - textheight/2));
toppad=ceil(magicnum*(1-textheight)-botpad);
inpict=padarray(inpict,[0 10],'both'); % pad ends
inpict=padarray(inpict,[toppad 0],'pre'); % top pad
inpict=padarray(inpict,[botpad 0],'post'); % bottom pad
inaspect=length(inpict(1,:))/length(inpict(:,1)); % recalculate
inpict=imresize(inpict,magicnum*[1 alteraspect*inaspect]);
inpict=flipud(inpict); % flip image (low-f on bottom)
if (bluramt > 0);
h=fspecial('gaussian', blurrad*[1 1], bluramt);
inpict=imfilter(inpict,h);
end
nft=length(inpict(:,1))*2-2;
h=nft;
[x,t]=istft(inpict,h,nft,samplefreq);
[S, f, t_stft]=stft(x, nft, h, nft, samplefreq); % named S so the stft() function is not shadowed
figure(1)
subplot(1,2,1); plot(t, x);
subplot(1,2,2); imshow(flipud(real(S)));
cmap=colormap('gray');
colormap(flipud(cmap))
xp=x/(max(abs(x))*1.0001)*volume;
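% wavwrite() was removed from newer MATLAB releases; there, use
% audiowrite(outwavname, xp, samplefreq, 'BitsPerSample', 32) instead
wavwrite(xp, samplefreq, 32, outwavname)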
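As a sanity check on the padding arithmetic, this sketch reruns the same formulas with the default parameters and reports the band the text should land in. The names textrows and rowhz are made up for illustration, and the row count assumes imresize() turns the requested 19.8 rows into 20.

```matlab
% Sketch: which rows and frequencies the text band ends up occupying,
% using the same arithmetic and default values as the script above.
magicnum = 660; samplefreq = 44100;
textheight = 0.03; textlocation = 19000;

botpad   = floor(magicnum*((2*textlocation/samplefreq) - textheight/2));
toppad   = ceil(magicnum*(1-textheight) - botpad);
textrows = 20;                        % assumption: imresize yields 20 rows for 19.8

rowhz = (samplefreq/2)/magicnum;      % rough Hz per row before the final resize
fprintf('padding: %d rows below, %d rows above\n', botpad, toppad);
fprintf('text spans roughly %.0f to %.0f Hz\n', ...
    botpad*rowhz, (botpad+textrows)*rowhz);
```

With the defaults, the band center should come out within a few tens of Hz of the requested textlocation; the leftover error is just the floor/ceil rounding.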
Multiple rows of text (one is flipped) mixed into a song
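Mixing amounts to zero-padding the shorter signal, adding the two, and rescaling if the sum would clip. A minimal sketch, with synthetic sine waves standing in for the song and the marquee file (real files would be read with wavread() or audioread(), depending on the MATLAB version); both signals are assumed to share one sample rate.

```matlab
% Sketch: mix a quiet marquee signal into a music track of the same sample rate.
% Both inputs are synthetic stand-ins (hypothetical), not real files.
fs      = 44100;
music   = 0.5*sin(2*pi*440*(0:fs-1)'/fs);        % 1 s stand-in for the song
marquee = 0.1*sin(2*pi*19000*(0:fs/2-1)'/fs);    % 0.5 s stand-in for the text signal

n = max(numel(music), numel(marquee));
music(end+1:n, 1)   = 0;     % zero-pad whichever signal is shorter
marquee(end+1:n, 1) = 0;
mix = music + marquee;

peak = max(abs(mix));
if peak >= 1
    mix = mix/(peak*1.0001); % rescale only if the sum would clip
end
```

Keeping the marquee volume low (the text script defaults to 0.1) leaves the music level essentially untouched.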
Of course, I only noticed after the fact that I had reinvented the wheel.