File attached: a PowerPoint presentation (this is forensics after all...)
The file contained slides with useless base64 comments, and images that also contained useless base64 comments (cf. http://forensic-proof.com/archives/495).
However, on slide 3 there are some interesting cells: AC64 and AC65 (and, indirectly, the empty cells AC21 and AC22). Both cells contain base64 blobs. Once concatenated, the base64 can be decoded to obtain a 27461-byte file:
01 08 31 08 31 80 30 01 08 31 08 31 80 34 01 08 |..1.1.0..1.1.4..|
31 08 31 80 30 22 31 e7 99 62 11 c4 01 31 f7 91 |1.1.0"1..b...1..|
11 81 d0 40 30 f7 99 20 00 d4 82 21 f7 99 72 91 |...@0.. ...!..r.|
d0 85 11 f7 91 11 a1 54 40 30 e7 18 20 00 c0 e7 |.......T@0.. ...|
...
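The decoding step that produces this dump can be sketched in Python as follows. This is only a rough illustration: the helper names `decode_blobs` and `hexdump` are made up for the example, and the two strings are stand-ins rebuilt from the first 21 bytes of the dump, not the real contents of AC64 and AC65.

```python
import base64

def decode_blobs(*blobs):
    """Concatenate base64 strings (whitespace stripped) and decode the result."""
    joined = "".join("".join(blob.split()) for blob in blobs)
    return base64.b64decode(joined)

def hexdump(data, width=16):
    """Render bytes as hex plus printable ASCII, `width` bytes per row."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(hex_part.ljust(width * 3 - 1) + " |" + text + "|")
    return "\n".join(rows)

# Stand-in blobs (assumption): the first 21 bytes of the dump, re-encoded
# and cut in two to mimic the two cells.
sample = bytes.fromhex("010831083180300108310831803401083108318030")
encoded = base64.b64encode(sample).decode("ascii")
part1, part2 = encoded[:12], encoded[12:]

data = decode_blobs(part1, part2)
print(len(data), "bytes")
print(hexdump(data))
```

With the real blobs, `len(data)` should come out as 27461.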
Interestingly, 27461 is the product of 7 and 3923, which are both prime. This matters because the hexadecimal output seems to be aligned on 7 bytes, which stands out at least for the first 3 "packets":
01 08 31 08 31 80 30
01 08 31 08 31 80 34
01 08 31 08 31 80 30
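A quick way to check the size argument and to look at the data in 7-byte rows is sketched below. The function names `factorize` and `split_packets` are hypothetical, and only the first 21 bytes of the file are used as input.

```python
def factorize(n):
    """Return the prime factors of n by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def split_packets(data, size=7):
    """Cut data into consecutive chunks of `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]

print(factorize(27461))  # [7, 3923]

# First 21 bytes of the decoded file, taken from the dump.
head = bytes.fromhex("010831083180300108310831803401083108318030")
for packet in split_packets(head):
    print(packet.hex(" "))
```

Since 7 and 3923 are the only factors, 7 bytes is the only non-trivial record size that divides the file evenly (3923 aside).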
As a matter of fact, this signature is not referenced in any signature database that I know of...
So let's try to split the hexadecimal stream into 7-byte packets and compare them with each other:
- Identical packets:
A few hundred packets appear several times, up to 11 times for "3C BC 73 FD CA AA 0C". This is clearly not a dictionary-based compression format...
Searching for a 7-byte (56-bit) aligned file or packet format does not return anything really helpful... too bad.
- Most recurring byte at a given position:
When browsing the full file in its 7-byte aligned form, one recurring byte I noticed was 0xF7 at the 3rd position. Not because it is the most frequent value at that position, but because it was nearly always followed by 0x99 or 0x91. In each of those cases the first nibble (4 bits) of the packet was even. This path won't lead us anywhere, but it confirms the idea that these 7-byte packets really have a meaning.
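Both checks above (repeated packets, and what follows a given byte at a given position) fit in a few lines of Python. This is a sketch, not the code used at the time: `split_packets` and `followers` are hypothetical helpers, and the input is only the first 9 complete packets visible in the dump, so the counts differ from those of the full file.

```python
from collections import Counter

def split_packets(data, size=7):
    """Split data into fixed-size packets, ignoring a trailing partial packet."""
    return [data[i:i + size] for i in range(0, len(data) - size + 1, size)]

def followers(packets, position, value):
    """Count the bytes that follow `value` when it sits at `position` (0-based)."""
    return Counter(p[position + 1] for p in packets if p[position] == value)

# First 63 bytes of the decoded file (9 packets), taken from the dump.
data = bytes.fromhex(
    "01083108318030" "01083108318034" "01083108318030"
    "2231e7996211c4" "0131f7911181d0" "4030f7992000d4"
    "8221f7997291d0" "8511f79111a154" "4030e7182000c0"
)
packets = split_packets(data)

# Identical packets and how often each one appears.
repeats = Counter(packets)
for packet, count in repeats.most_common(3):
    print(packet.hex(" "), count)

# Bytes seen right after 0xF7 at the 3rd position (index 2).
after_f7 = followers(packets, 2, 0xF7)
print({hex(b): n for b, n in after_f7.items()})

# Is the first nibble even in every packet that has 0xF7 at the 3rd position?
even_nibble = all((p[0] >> 4) % 2 == 0 for p in packets if p[2] == 0xF7)
print(even_nibble)
```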
Trying to determine the exact reason why some bits are correlated inside a packet is not really interesting and probably not feasible. However, determining the ranges inside the packets in which certain bits are correlated may help split the packets to a lower granularity, so that we can compare the result with an existing format. In addition, writing quick and dirty code to compare blocks of bits is fairly easy.
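As a matter of fact, some bit patterns sit at the same place in all packets: they all end in b'00 and, more interestingly, even packets (counting from 1) end in b'100 while odd packets end in b'000. Dropping the two constant trailing bits leaves 54 meaningful bits out of 56, the last of which alternates from one packet to the next. At this point we may want to call these "7-byte packets" 54-bit frames.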
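The kind of quick and dirty bit comparison mentioned above could look like the following sketch. The helper names are hypothetical, bit indices are 0-based starting from the most significant bit of the first byte, and the input is again limited to the 9 packets visible in the dump, so other positions may look constant here by coincidence.

```python
def to_bits(packet):
    """Return the packet as a string of '0'/'1', most significant bit first."""
    return "".join(f"{b:08b}" for b in packet)

def constant_bits(packets):
    """Map each bit index that is identical in every packet to its value."""
    rows = [to_bits(p) for p in packets]
    first = rows[0]
    return {i: first[i] for i in range(len(first))
            if all(row[i] == first[i] for row in rows)}

def alternating_bits(packets):
    """List the bit indices whose value flips between all consecutive packets."""
    rows = [to_bits(p) for p in packets]
    return [i for i in range(len(rows[0]))
            if all(a[i] != b[i] for a, b in zip(rows, rows[1:]))]

# First 9 packets of the decoded file, taken from the dump.
data = bytes.fromhex(
    "01083108318030" "01083108318034" "01083108318030"
    "2231e7996211c4" "0131f7911181d0" "4030f7992000d4"
    "8221f7997291d0" "8511f79111a154" "4030e7182000c0"
)
packets = [data[i:i + 7] for i in range(0, len(data), 7)]

tails = [to_bits(p)[-3:] for p in packets]
print(tails)                      # last 3 bits of each packet, in order
print(constant_bits(packets))     # includes bits 54 and 55, both '0'
print(alternating_bits(packets))  # includes bit 53, the 54th bit
```

On this sample the last two bits (indices 54 and 55) are always 0 and bit 53 flips on every packet, which matches the 54-bit frame reading.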
This time we can try to search the web with: "54 bits" frame "synchronization bit"
The results will lead us to Linear Prediction algorithms for speech encoding.
Knowing this, if we look closer at the first 3 sequences of 7 bytes, we find they are actually the initialization values for the speech synthesizer filter. So in practice, the signature should have been easy to identify with just this information...
A deeper analysis of the "correlated bits" found above gives 10th-order LPC as a likely candidate. This is good news because it is supported by
The resulting sound spells out a hexadecimal stream using the NATO phonetic alphabet (you should be able to decode this one pretty easily):
a4 b0 b0 ac 76 6b 6b a1 aa 9f b5 9f a8 ab ac a1 a0 a5 9d a0 ae 9d a9 9d b0 a5 9f 9d 6a 9f ab a9 6b 89 a5 9f a4 a1 a8 a8 a1 9b 89 9d a0 a5 a3 9d aa
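If you type what you hear as words, turning the transcript back into bytes is mechanical. The sketch below assumes a transcript format of my own (one lowercase word per hex digit, letters as NATO code words, digits as plain English numbers); the actual recording is not reproduced here and may pronounce things differently.

```python
# Hex letters as NATO code words (both "alfa" and "alpha" spellings accepted).
NATO_HEX = {
    "alfa": "a", "alpha": "a", "bravo": "b", "charlie": "c",
    "delta": "d", "echo": "e", "foxtrot": "f",
}
# Digits as spoken numbers (assumed transcript convention).
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "niner": "9",
}

def words_to_bytes(transcript):
    """Convert a space-separated transcript of spoken hex digits into bytes."""
    table = {**NATO_HEX, **DIGITS}
    hex_digits = "".join(table[word] for word in transcript.lower().split())
    return bytes.fromhex(hex_digits)

# Hypothetical transcript of the first four bytes of the stream.
heard = "alpha four bravo zero bravo zero alpha charlie"
print(words_to_bytes(heard).hex(" "))  # a4 b0 b0 ac
```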
Subtracting 0x3C from each byte gives us: http://encyclopediadramatica.com/Michelle_Madigan (valid link in 2011: http://encyclopediadramatica.ch/Michelle_Madigan)
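For reference, the subtraction can be reproduced with a couple of lines of Python (the variable names are only illustrative):

```python
# Hexadecimal stream spelled out by the synthesized voice.
stream = bytes.fromhex(
    "a4 b0 b0 ac 76 6b 6b a1 aa 9f b5 9f a8 ab ac a1 a0 a5 9d a0 ae 9d a9 9d "
    "b0 a5 9f 9d 6a 9f ab a9 6b 89 a5 9f a4 a1 a8 a8 a1 9b 89 9d a0 a5 a3 9d aa"
)

# Subtract 0x3C from every byte (modulo 256) and read the result as ASCII.
decoded = bytes((b - 0x3C) & 0xFF for b in stream).decode("ascii")
print(decoded)  # http://encyclopediadramatica.com/Michelle_Madigan
```

If the offset were unknown, guessing that the stream starts with "http" would give it directly: 0xa4 - ord("h") = 0x3C.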
After reading that page, I think that yes, she got pain when she tried to get gain. At least it seems so...
Key: Michelle Madigan (?)