If your organization uses Unicode filenames, scripting shortcuts with Windows Script Host's (WSH's) WshShortcut object can be an exercise in frustration. What's particularly irritating is that this object allows you to write Unicode text to any of its properties without any fuss, but when you attempt to save the shortcut, WshShortcut either throws an error (if the shortcut's filename or path contains Unicode) or silently mangles the Unicode into garbage (if any other property contains Unicode).
Adding one final insult, WshShortcut returns the same generic error message—unable to save file <filepath>—for any refusal sent back by the Windows file system. The Unicode characters in the file path will be replaced with a question mark (?), making the error message even more confusing.
Fortunately, there are ways to work around these problems, which I'll explain. First, though, I'll shed some light on why WshShortcut exhibits these annoying behaviors.
Why WshShortcut Is Unicode Illiterate
The reason why WshShortcut accepts Unicode is easy to explain: It has to. Quoting from Microsoft's own COM interface design rules: "All string parameters in interface methods must be Unicode" [http://msdn.microsoft.com/en-us/library/ms692709(VS.85).aspx]. In WSH, all text written to a file (or another type of object) is represented internally as Unicode. (WSH and its scripting languages always use Unicode internally.) This pushes string conversion problems down to the level of the components that produce the output in a particular representation.
So why does WshShortcut throw errors and mangle content even though you're allowed to use Unicode in shortcuts? There's no official explanation for this, but it makes sense if you understand the historical context of WSH.
The WSH helper objects, including WshShell and its member objects such as WshShortcut, were designed when Windows 98 and Windows 95 (both singularly lacking in Unicode file-system support) dominated the desktop. At that time, three simplifications were made:
- The designers wanted the WshShortcut object to function on all supported platforms. The easiest way for the same codebase to work identically on all systems is to use non-Unicode APIs, so that's what WshShortcut uses.
- Instead of inspecting the text it's given, WshShortcut simply assumes that it's ANSI text represented in Unicode. If you supply ANSI text to the object, there's no problem. When ANSI text is represented in Unicode, the first byte in each pair of bytes is empty, and WshShortcut's technique of chopping the first byte of each byte pair is no problem. However, when you supply Unicode characters, you now only have the representation of the last half of each character. This is essentially like chopping off the first half of the numbers in a street address. The address 8922 North Main becomes 22 North Main, which is not only wrong, but also might not even correspond to a real address. Thus, when text that needs Unicode representation gets treated in this manner, you get garbage.
- WshShortcut simply generates a binary shortcut file as raw bytes that will be written to disk. There are no checks to determine whether the data written is nonsense or will cause problems. In the case of the shortcut's pathname, the file-system APIs reject most nontext characters—and a mangled Unicode filename is highly likely to include some illegal characters.
At this point, we're stuck with WshShortcut' shortcomings. Any chance of WshShortcut being updated to support Unicode vanished when WSH was frozen in 2001. However, as I mentioned previously, there are several workarounds for making Unicode-friendly shortcuts.