Resource: Primer on working with data in images
(moved from original post so I can continue with updates)
I wanted to give a quick primer on what we’ve learned about pulling data from images in Photoshop (or another image application), as I don’t think that’s been covered well and many people get stuck there.
Note: I am currently using Photoshop CS5
EDIT/NOTE Nov 24: Recent puzzles have had data in both the alpha layer and the color channels. This is causing issues for those (like me) who have been using imaging software to get at the data. I’m leaving the original information below in case we need it, and have added Part 4 for extracting the channels directly.
// Part 1: The goal (RAW)
For those who understand the rest already, let’s start here. If you don’t, just note that this is the final step most of the time.
To get binary data for some other format that’s embedded in an image, we are ultimately trying to get an 8-bit grayscale image (or, in one case here, multiple channels of grayscale data; more on that later) of the data portion of the image (usually “noise”). Once you have that, every grayscale value is a number between 0 and 255 (00-FF hex), which is exactly one byte. Perfect for a binary file.
To get this into a file, you want to save it as a Photoshop RAW file (.raw). If you are dealing with only one channel or using the alpha layer, you usually want to be in grayscale mode (Image > Mode > Grayscale) before you actually save.
Once you have done this, you will need to open the raw file in a hex editor to determine what type of file it is, and remove any padding left over from the original image.
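If you would rather script the hex editor step, here is a minimal Python sketch. The helper names (strip_padding, sniff) and the signature table are my own illustration, not something from the puzzles, and the table is only a small sample of common file signatures. It assumes the padding is trailing zero bytes from black-filled pixels.

```python
# Hypothetical helpers: identify a .raw dump by its leading bytes and
# drop the zero padding left over from black-filled pixels.

# A small, non-exhaustive sample of well-known file signatures.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF8": "gif",
    b"PK\x03\x04": "zip",
    b"%PDF": "pdf",
    b"\x1f\x8b": "gzip",
    b"Rar!": "rar",
}


def strip_padding(raw: bytes) -> bytes:
    """Remove trailing 0x00 bytes (assumed to be image padding)."""
    return raw.rstrip(b"\x00")


def sniff(raw: bytes) -> str:
    """Return a guess at the file type based on its first bytes."""
    for magic, name in SIGNATURES.items():
        if raw.startswith(magic):
            return name
    return "unknown"
```

Be careful with strip_padding: a file that legitimately ends in zero bytes will lose them too, so keep the untrimmed copy around and compare in a hex editor if the result will not open.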
If it looks like the file is backwards, go back to the image and rotate it 180 degrees (this is common). If it still looks like garbage, don’t forget to try inverting your grayscale image (black <> white) before saving it.
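Both fixes can also be applied to a single-channel .raw file after the fact. This is a sketch in Python with function names of my own choosing. It relies on the fact that rotating a one-byte-per-pixel image 180 degrees reverses both the row order and the pixel order within each row, which is the same as reversing the whole byte string.

```python
def rotate_180(gray: bytes) -> bytes:
    """Rotate a single-channel image 180 degrees by reversing its bytes."""
    return gray[::-1]


def invert(gray: bytes) -> bytes:
    """Swap black and white: each value v becomes 255 - v."""
    return bytes(255 - v for v in gray)


def candidates(gray: bytes) -> dict:
    """Return all four variants so each can be checked in a hex editor."""
    rotated = rotate_180(gray)
    return {
        "as_is": gray,
        "rotated": rotated,
        "inverted": invert(gray),
        "rotated_inverted": invert(rotated),
    }
```

The byte reversal trick only holds for one channel. With interleaved multi-channel data you would have to reverse whole pixels rather than individual bytes.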
// Part 2: Alpha (Transparency) Layer
The first image found in this package is a good example: concatenate.png
Note that it’s a png, which is a lossless image compression format. This isn’t going to work with a “lossy” compressed image like most jpgs because the compression would corrupt the information.
To get the alpha layer out in Photoshop, this is the technique most of us have used. There may be better options in this and other programs:
- Load the transparency information as a selection: Select > Load Selection > Layer 1 Transparency as new Selection
- Create a new Layer
- Fill the selection with White
- Select the whole layer and copy
- Create a new image; it should default to the dimensions of what you copied. Set the background to black, or fill it with black afterwards.
- Paste your selected information as a new layer
This should give you black and white noise.
- Flatten the image
- In the case of this image rotate it 180 degrees
- Make sure you are in 8-bit Grayscale mode
- Save as RAW per above
// Part 3: Color Channels
The next image found contained data in the color channels: alpha_channel_180_rotate.png
This has information in two channels and a hint in the third (sheesh), but for now let’s assume there is only one to deal with.
To determine which channel has the data, go to the channels pane and make sure only one channel is visible and selected. Then:
- Delete the other channels you are not interested in.
- Convert the remaining channel to grayscale
- Crop as close to the data as you can. You will probably have a line with some data, and some of the original image.
- Select the image portion of the line and fill it with Black so you can identify it as padding (zeros) later. If you have trouble figuring out where the data stops, you can go back to the original color image and do it at that point.
- Save as RAW per above
Alternatively, and faster:
- Crop and fill image padding with black first
- Delete the two channels you are not interested in
- Save as RAW at this point
Now, for the image above it gets more confusing. The data was interleaved in two channels, red and green. The hint was in the blue channel - 7 bytes at the end that spelled SHUFFLE if you rotated the image 180 degrees. So there were two hints here:
A) Rotate the image 180 first
B) Interleave the data of the two channels.
Props to Furious for figuring out how to do (B). I had no idea this could be done. Here are the steps for this image:
- Crop and fill the padding with black while you are still looking at the RGB image.
- Rotate 180 degrees if you haven’t already
- Delete the blue channel
- Save as RAW, but don’t convert to grayscale, and since you have two channels, it will give you the option to interleave them. Do so, and you will get the RAW file you need.
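For anyone who ends up with the two channels as separate .raw files, the interleaving can be sketched in a few lines of Python. The function name is my own, and this assumes what the interleave option appears to do for this image: take one byte from each channel in turn (red, green, red, green, ...), with both channels the same length.

```python
def interleave(*channels: bytes) -> bytes:
    """Merge equal-length channels by taking one byte from each in turn."""
    if len({len(c) for c in channels}) != 1:
        raise ValueError("channels must all be the same length")
    out = bytearray()
    for values in zip(*channels):
        out.extend(values)  # one byte from each channel, in argument order
    return bytes(out)
```

For example, interleave(b"ACE", b"BDF") gives b"ABCDEF". If the result looks wrong, try swapping the order of the arguments.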
// Part 4: Direct extraction of the RAW channels
Since Photoshop and other applications integrate the transparency information into the individual color channels, this can effectively corrupt your RAW output, because the color channel values are offset by the values of the alpha channel. In some cases you can correct for this, but generally it’s a big headache.
Huge thanks go to SepheusIX for the following.
You can extract original RAW versions of all four channels of the PNG with a Ruby script and an additional module called ChunkyPNG. To get this set up:
- Install Ruby: Ruby Programming Language
If you are on Windows the latest installer can be found here: http://rubyinstaller.org/
Note: for the Windows installer, to make your life easier, choose to add the Ruby executables to your PATH and to associate Ruby file types.
- Install the ChunkyPNG module:
Go to the command line and type ‘gem install chunky_png’
- Save the ruby script for extracting channels:
The script can be found here: http://pastebin.com/WwpM3KNh
Save the script as something like extractchannels.rb
- Place the png you want to process in the same folder as the script and rename it to ‘source.png’
- Run the script. In Windows, if you associated .rb files with Ruby during install, you can just run it from Explorer.
- It should chug on the file for a little bit and output four raw files in the same folder, one each for alpha, red, green and blue.
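If you do not have Ruby handy, the same idea can be sketched in Python using only the standard library. To be clear, this is not SepheusIX's script and it does not use ChunkyPNG; it is my own minimal illustration, and it assumes an 8-bit, non-interlaced RGBA PNG (anything else raises an error). It reads the stored sample values directly, so the color channels are not altered by the alpha values.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def paeth(a, b, c):
    """Paeth predictor from the PNG spec (left, up, upper-left)."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def decode_rgba_png(data):
    """Return (width, height, flat RGBA bytes) for an 8-bit RGBA PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos, idat, header = 8, bytearray(), None
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + body + CRC (CRC not checked)
        if kind == b"IHDR":
            header = struct.unpack(">IIBBBBB", body)
        elif kind == b"IDAT":
            idat += body
        elif kind == b"IEND":
            break
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, color_type, _, _, interlace = header
    if (depth, color_type, interlace) != (8, 6, 0):
        raise ValueError("sketch only handles 8-bit non-interlaced RGBA")
    raw = zlib.decompress(bytes(idat))
    bpp, stride = 4, width * 4
    prev, out = bytearray(stride), bytearray()
    for y in range(height):
        start = y * (stride + 1)  # each scanline starts with a filter byte
        ftype = raw[start]
        if ftype > 4:
            raise ValueError("unknown filter type")
        line = bytearray(raw[start + 1:start + 1 + stride])
        for i in range(stride):
            a = line[i - bpp] if i >= bpp else 0   # left
            b = prev[i]                            # up
            c = prev[i - bpp] if i >= bpp else 0   # upper-left
            pred = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
            line[i] = (line[i] + pred) & 0xFF
        out += line
        prev = line
    return width, height, bytes(out)


def split_channels(rgba):
    """Split flat RGBA bytes into one byte string per channel."""
    names = ("red", "green", "blue", "alpha")
    return {name: rgba[i::4] for i, name in enumerate(names)}
```

To use it, read source.png in binary mode, pass the bytes to decode_rgba_png, run split_channels on the result, and write each of the four byte strings to its own .raw file.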
// Side Note
You can use Photoshop like this in reverse when you are trying to look at data and see patterns in it. Open a binary file in Photoshop as a raw file (you’ll have to pick some dimensions, something equal to or smaller than what the file could be were it a real image). Looking at the lines of gray, sometimes you can pick out patterns in the values: solid gray for consistent data, repeating patterns, sudden changes.
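If you do not have Photoshop, one alternative is to wrap the bytes in a binary PGM (P5) header, which many image viewers can open as 8-bit grayscale. This is a sketch of my own, not part of the original workflow. Unlike the Photoshop approach, it pads the last row with zeros instead of requiring dimensions that fit inside the file, and it guesses a roughly square width if you do not supply one.

```python
import math


def to_pgm(data: bytes, width: int = 0) -> bytes:
    """Wrap arbitrary bytes in a binary PGM (P5) grayscale image."""
    if not data:
        raise ValueError("no data to display")
    if width <= 0:
        width = max(1, math.isqrt(len(data)))  # roughly square by default
    height = -(-len(data) // width)  # ceiling division
    padded = data.ljust(width * height, b"\x00")  # zero-fill the last row
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + padded
```

Write the result to a file ending in .pgm and try a few different widths; patterns often only line up when the width matches the record size of the underlying data.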
Many thanks to all those who have provided help and instruction along the way.