Pal Mickey reverse-engineering

WIP. Only notes for now.

Pal Mickey was an interactive talking plush toy sold in the US Disney parks from 2003 to 2008. More details about its history and the different versions can be found on Wikipedia.

The interactions are two-way: the user can press various parts of the toy to trigger modes and answer trivia questions, making it autonomous and usable as a regular talking toy. It can also react to and store data sent by fixed infrared transmitters that were hidden all over the theme parks, providing location-based information to users such as show and queue times, as if the toy was aware of its surroundings. So magical.

The IR beacon idea is clever and the implementation was one of a kind, but the technology itself wasn't exceptional for 2003.
The really notable technical part is the amount of voice data stored in the toy.

The infrared sensor and protocol didn't allow live transmission of audio due to bandwidth and line-of-sight constraints, so everything Pal Mickey says is stored in non-volatile, static memory. Speech is composed live by an embedded program which assembles phrases or words. Recorded and compressed samples were used, probably because Mickey's voice and complex intonations would have been a challenge to synthesize down to the phoneme level.

The first surprise was the number of COBs found on the board.
I was only expecting an MCU and external memory, maybe a power amp for the speaker, nothing else.

The smallest one (U3) has traces going straight to the speaker, so it's an easy one: it's the speaker amp.
The one on the carrier board is wired to the serial EEPROM, the connector to the nose and hands (I/Os), and a 32.768kHz crystal, so it's probably the MCU.
The largest one has a silkscreen marking in the shape of a TSSOP package, so it has to be the memory.
I initially thought that U2 was a dedicated speech decoding chip since it was directly connected to the suspected memory and had its own 455kHz resonator, but tracing the analog audio path back from the speaker amp showed that the source was U1! What could U2 be...

Observing signals found on the many test points scattered around the suspected memory while the toy is talking showed surprisingly sparse activity, in short bursts, with longer inactive periods matching silences in the speech. After probing other points for confirmation and finding very similar patterns, I was reminded of the data access patterns of the LPC speech ROMs used by TI's Speak and Spell, where silent frames are simply coded as delays with a few bits indicating the duration. It was clear that a dedicated speech compression algorithm was in use.

ADPCM and other simple, low-efficiency compression algorithms need a constant stream of data at a fixed rate.
MP3 had already been around for a few years but isn't particularly well suited to low-bitrate voice, and doesn't encode silence as no-data delays. Early decoding chips might also have been prohibitively expensive for a toy.
Telephony codecs could have been a sensible choice, but again, Mickey's voice might have been a challenge to reproduce accurately.

Searched for speech playback chips, only found modern "voice MCUs" that used serial flash or MP3-based chips.
Looked for evolutions of TI's LPC chips but didn't find much that sounded high quality enough. I kind of lost hope, thinking it must be an obscure IC that was only sold in bare die form, never listed on any website, from one of the myriad Taiwanese fabless vendors of the time, who only sent datasheets by fax after a phone call proving one wanted to buy 200,000 units...

To go any further I'd have to identify the chips by looking at their silicon markings, which meant removing the COB encapsulation and destroying the bond wires. In hindsight, I came VERY close to guessing which MCU it was. I should have spent a bit more time on archive.org.

After months of eBay alert e-mails returning unrelated crap, I had a chance to buy a second, reasonably priced Pal Mickey.
I compared its physical state with the one I already had to decide which one was to be sacrificed for science.

I removed most of the COB epoxy with 400°C hot air and a scalpel, removed the dies from the board, and removed the attached material with hot sulfuric acid.

The largest one is indeed parallel memory: an OTP device from AMIC (AM9331A), which I haven't found a datasheet for.
However, its pinout perfectly matches that of a standard 512k*8/256k*16 (4Mbit, i.e. 512kB) JEDEC chip used in byte mode.
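As a quick sanity check on that organisation, here is the arithmetic as a small sketch. It only assumes the 512k*8/256k*16 geometry inferred from the pinout, not anything from an AM9331A datasheet (which I don't have); the variable names are mine.

```python
# Geometry of a 512k x 8 / 256k x 16 memory, as inferred from the pinout.
# Nothing here comes from an actual AM9331A datasheet.
BYTES_X8 = 512 * 1024    # addressable locations in byte mode
WORDS_X16 = 256 * 1024   # addressable locations in word mode

total_bits = BYTES_X8 * 8
# Both organisations must describe the same memory array.
assert total_bits == WORDS_X16 * 16

size_mbit = total_bits // (1024 * 1024)
size_kib = BYTES_X8 // 1024
# Number of address bits needed to reach every byte in byte mode.
addr_lines_byte_mode = (BYTES_X8 - 1).bit_length()

print(f"{size_mbit} Mbit = {size_kib} KiB, "
      f"{addr_lines_byte_mode} address bits in byte mode")
```

So a full dump in byte mode should come out at exactly 524288 bytes.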

The mysterious U2 is a small, square die without any markings around the edges. A closer look led to the discovery of a Sunplus logo and a "PU5165" marking. 44 pads, fully digital, no memory, very little logic.
It's actually a Generalplus GPBA01B bus extender. The datasheet even has a die picture that matches perfectly.

Not sure why it needs the 455kHz resonator. It never seems to run, not even to wake up the MCU.

And finally, the MCU turned out to be a Texas Instruments MSP50C604, a special 16-bit mixed-signal MCU with DSP functions dedicated to speech playback, with a 64kB ROM and 16 IOs, explaining the need for a bus extender to allow the use of external parallel memory.

The datasheet and the user guide are available, but the software tools they mention aren't. TI's customer service says they don't have anything left.

Interestingly, the user guide mentions that the readout protection flags are read on startup by the fixed section of the firmware, so a glitching attack might be possible. However, without the protocol documentation of the proprietary JTAG-like debug/programming interface, I can't even tell whether the firmware is locked or not.

Dumped the external memory by keeping the MCU in reset and taking advantage of the bus extender to connect fewer wires: download.

Geoff Martindale appears as the author of the example code found in the MCU's user guide.
His LinkedIn profile shows that he might have sparked the Pal Mickey idea: "Developed TI-TALKS speech synthesis reference code" [...] "Presented to VPs of Hasbro, Mattel & Disney at Toyfair 2000".
Kind of creepy but getting in touch with him might be the only way to get a hold of the original dev tools.

 

Stuff not related to speech playback:

Connector pinout, left to right on pictures above:
Brown: Nose, IR receiver signal active low
Red: Nose, ground for IR receiver, periodically disconnected for power saving
Orange: Nose, IR receiver power, constant 3.3V
Yellow: Nose, IR LED cathode (signal)
Green: Nose, IR LED anode, constant 5V
Blue: Hand switch A
Purple, grey: Common 3.3V for hand switches, periodically enabled
White: Hand switch B

Whatever IR receiver is in the nose is connected to a fixed 3.3V line via the orange wire.
The red wire is connected to ground for 1.5s every 3s to power the sensor periodically.
Brown is the active low IR signal post-demodulation.

Patent US8157610B1 mentions a TSOP1838 receiver, a 38kHz carrier and 2400 baud 8N1 data.
Any carrier below 35kHz or above 39kHz results in no signal. Best seems to be 38kHz.
A 50% duty cycle data signal above 2.4kHz starts to cause dropouts.
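To put that 2.4kHz limit in perspective, the sketch below converts data rates into carrier cycles per burst at 38kHz. The function name is mine, and the idea that the dropouts are caused by bursts becoming too short for the receiver is an assumption, not something I measured.

```python
CARRIER_HZ = 38_000

def carrier_cycles_per_burst(data_hz: float, carrier_hz: float = CARRIER_HZ) -> float:
    """Carrier cycles in the 'on' half of a 50% duty cycle square wave."""
    on_time_s = 1.0 / (2.0 * data_hz)
    return on_time_s * carrier_hz

# Square wave at the frequency where dropouts start.
limit = carrier_cycles_per_burst(2400)

# Worst case for 2400 baud data: alternating bits, i.e. a 1200Hz square wave,
# so each burst lasts one bit time.
worst_case_2400_baud = carrier_cycles_per_burst(1200)

print(f"2.4kHz square wave: {limit:.1f} carrier cycles per burst")
print(f"2400 baud, alternating bits: {worst_case_2400_baud:.1f} carrier cycles per burst")
```

In other words, 2400 baud data leaves roughly twice as many carrier cycles per burst as the point where the signal starts to drop out.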

The IR signal transmitted when a hand is pressed is 2325bps 8N1 on a 38kHz carrier.
Carrier on for 8ms, off for 10ms, then 55 AA 05 03 04 00 00 0C.
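For reference, a minimal sketch of what 8N1 framing means for these bytes and how long one frame lasts at 2325bps. The helper names are mine. Which logic level corresponds to "carrier on" is not covered here, since I haven't established it in these notes.

```python
BAUD = 2325  # measured rate of the hand-press transmission

def uart_8n1_bits(byte: int) -> list[int]:
    """Logic levels for one byte: start bit (0), 8 data bits LSB first, stop bit (1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]

def frame_duration_ms(n_bytes: int, baud: int = BAUD) -> float:
    """Duration of n back-to-back 8N1 bytes (10 bit times each), in milliseconds."""
    return n_bytes * 10 * 1000.0 / baud

FRAME = bytes.fromhex("55 AA 05 03 04 00 00 0C")

print(uart_8n1_bits(FRAME[0]))  # 0x55 gives a strictly alternating bit pattern
print(f"{frame_duration_ms(len(FRAME)):.2f} ms per frame")
```

This assumes the bytes are sent back to back with no idle time between them.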

55 AA is probably a header.
05 03 04 00 00 is probably command / parameters.
0C is a simple checksum of all bytes except header (05 + 03 + 04 + 00 + 00).
Whole frame is repeated four times with 15ms pauses.
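The frame layout guessed above can be sketched as follows. The function names are mine, and truncating the sum to 8 bits is an assumption: none of the captured frames has a sum large enough to overflow, so the captures can't confirm it.

```python
HEADER = bytes([0x55, 0xAA])

def checksum(payload: bytes) -> int:
    """Sum of all payload bytes (header excluded), assumed truncated to 8 bits."""
    return sum(payload) & 0xFF

def build_frame(payload: bytes) -> bytes:
    """Header, payload, then the one-byte checksum."""
    return HEADER + payload + bytes([checksum(payload)])

def is_valid(frame: bytes) -> bool:
    """Check the header and the trailing checksum of a captured frame."""
    return (
        len(frame) > len(HEADER) + 1
        and frame[:2] == HEADER
        and checksum(frame[2:-1]) == frame[-1]
    )

captured = bytes.fromhex("55 AA 05 03 04 00 00 0C")
rebuilt = build_frame(bytes.fromhex("05 03 04 00 00"))
print(rebuilt.hex(" ").upper(), is_valid(captured))
```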

Each time the hand is pressed, the frame changes:

Counts up with xx = 00 to 05: 55 AA 05 03 xx 00 00 cc (55 AA 05 03 00 00 00 08 to 55 AA 05 03 05 00 00 0D).
Then 55 AA 05 03 0A 00 00 12
Then 55 AA 05 03 0C 00 00 14
Then 55 AA 05 03 0D 00 00 15
Then 55 AA 05 03 17 00 00 1F
Then back to 55 AA 05 03 00 00 00 08
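The whole press sequence can be regenerated from the list of observed xx values, which is handy for comparing against captures. This is only a replay table of what I saw, with names of my own choosing. The meaning of the values is unknown, and I'm assuming "00 to 05" covers every value in between.

```python
import itertools

# Third payload byte for each successive hand press, as observed.
OBSERVED_XX = [0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x0A, 0x0C, 0x0D, 0x17]

def frame_for(xx: int) -> bytes:
    """Frame for a given xx: 55 AA 05 03 xx 00 00 plus the additive checksum."""
    body = bytes([0x05, 0x03, xx, 0x00, 0x00])
    return bytes([0x55, 0xAA]) + body + bytes([sum(body) & 0xFF])

def press_sequence():
    """Endless iterator over the frames, wrapping around like the toy does."""
    return itertools.cycle([frame_for(xx) for xx in OBSERVED_XX])

# Print one full cycle plus the first frame after the wrap-around.
for frame in itertools.islice(press_sequence(), len(OBSERVED_XX) + 1):
    print(frame.hex(" ").upper())
```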

Tried transmitting thousands of random frames with the same format but no reaction.
