My Photo
Location: Downham Market, Norfolk, United Kingdom

Friday, September 16, 2016

It must sometimes seem that I only write a new blog post when I'm frustrated by some feature of Windows, and I'm sorry to say that's the case once again! It turns out that this is a longstanding issue, but it's one that has only been brought to my attention recently by a report of a surprising behaviour in LB Booster: pressing the Escape key causes the window to close unexpectedly, but only if it incorporates a TEXTBOX or TEXTEDITOR control which has the input focus. Superficially this is very peculiar.

I am grateful to Anatoly Sheyanov for having discovered the true cause, which is - to quote Raymond Chen of 'The Old New Thing' fame - "when you hit ESC in a multiline edit, it posts a WM_CLOSE to its parent in the mistaken belief that it is part of a dialog box.  This is for compatibility with Windows 2.0, which behaved this way.  It's annoying now, but compatibility means having to be bug-for-bug compatible with the previous version". This was written back in 1999 but appears to be still true today.

My frustration stems from the obvious fallacy that this bug needed to be preserved - and still does - "for compatibility with Windows 2.0". It would have been straightforward for Microsoft to introduce a new window style bit (or extended style bit) which, when set, disabled this rogue behaviour and instead caused the Edit Control to send the usual WM_COMMAND message containing the IDCANCEL code. Since applications written for Windows 2.0 shouldn't be setting this undefined bit they would carry in working as before, but new applications could set the bit and get the expected behaviour.

Another approach would have been to fix the bug in the Edit Control but to arrange that Windows applies a compatibility shim unless the program contains a manifest declaring that it isn't needed. This is the standard way that Windows addresses compatibility issues that might arise from the introduction of a new feature or changed behaviour of an existing feature. After all, back in the Windows XP days 'Version 6 Common Controls' were introduced, which were significantly different from their predecessors. To take advantage of them a program needed to declare compatibility in its manifest, so the 'Escape sending WM_CLOSE' bug could have been fixed at that stage.

Unfortunately there aren't any very straightforward workarounds for this bug. The only foolproof solution is to subclass the Edit Control and prevent the Escape keypress even reaching it, which is messy and may be impractical in some languages. The bug is an annoyance and an inconvenience which could have been entirely avoided as long ago as 1999, or before, had Microsoft only chosen to do so.

Sunday, May 19, 2013

Arguably, the most serious incompatibility between LB Booster and 'genuine' Liberty BASIC is the lack of support for huge strings. The maximum string length in LB4 isn't quoted anywhere, as far as I know, but is probably limited only by the memory available to BASIC, which is usually stated as being in the region of 70 Mbytes (although presumably one shouldn't expect it to work properly if a single string uses a significant proportion of that memory). In any case it is a lot bigger than BBC BASIC's 64 Kbytes (65535 characters) limit.

The ability to have multi-megabyte strings isn't necessarily desirable, because it can encourage the use of inappropriate algorithms. For example suppose one wants to count the number of lines in a text file. The traditional way, and the 'natural' way in a language which doesn't support huge strings, is to read the file sequentially, incrementing a counter for each line read. Only one line needs to be buffered in memory at a time, and the process is quite efficient.

In a language like Liberty BASIC, however, there is a temptation to read the entire file into a string - because you can - and then to scan the string for line delimiters (for example using the INSTR function) in order to count the lines. That will work, but it is highly inefficient of memory usage and CPU time. Firstly it involves holding the entire file in memory, which may be many megabytes, and secondly the INSTR function may need to scan the string from the beginning every time (or at least make a copy of the string first).

In a simple benchmark using LB 4.04, counting the lines in a 1 Megabyte text file with the first (sequential read) approach took about one second and with the second (whole file loaded into a string) approach took about 130 seconds. Repeating the test using LBB (for how that was possible, read on!) gave timings of 0.2 seconds and 20 seconds respectively. This illustrates the kind of speed improvement possible by using LBB, but in both cases loading the entire file into a string was about one hundred times slower than the sequential approach!

However, whilst it is possible to argue that supporting huge strings isn't always desirable, a significant proportion of LB programs do rely on that capability. To improve compatibility there is no alternative but for LB Booster to do the same. In investigating how that might be possible I first looked at the practicality of adapting the HUGESLIB library, which provides limited support for huge strings in BBC BASIC for Windows. Although in principle having the required functionality, it would involve identifying which string variables need to be able to hold huge strings (in itself non-trivial) and then substituting calls to library functions whenever those strings are manipulated.

I eventually decided that such an approach would be too complicated and too likely to suffer from obscure bugs. The only other possibility was to adapt BB4W itself to handle huge strings natively. Whilst clearly far from trivial, in fact such a modification wasn't desparately difficult for a couple of reasons. Firstly, I had already tackled increasing the maximum string length once before, when it was increased from 255 bytes (the limit in Acorn versions of BBC BASIC) to 65535 bytes, and the techniques used on that occasion are, in principle at least, fairly readily extended to support even longer strings. Secondly, because the 64K limit implied the string length being stored as a 16-bit quantity, the fact that the CPU's native registers are in fact 32-bits made it quite easy to adapt existing code (typically where the string length had previously been stored in the 16-bit CX register it could instead be stored in the 32-bit ECX register).

So it was that I modified the BB4W interpreter to support huge strings, specifically for the benefit of LBB. It's quite likely that the resulting performance is not as good as it might have been had huge strings been anticipated from the start, since in some cases the algorithms don't scale too well, but assuming that most strings will still be shorter than 64 Kbytes a typical program shouldn't experience much impairment. The version of LB Booster with this modification is, at the time of writing, in a beta test phase and is what was used to run the benchmark program.

Of course this raises a particular difficulty in respect of BB4W - should I make the 'huge string' version available to BBC BASIC programmers as well as to Liberty BASIC programmers? That is a much more tricky proposition, because unlike LBB - where the huge strings improve compatibility - the change from 16-bit to 32-bit lengths will inevitably impair compatibility and break some existing BBC BASIC programs. For example programs which assume the 'string descriptor' is 6 bytes long won't work when it increases to 8 bytes, and programs which include assembler code to manipulate strings almost certainly would need to be changed.

So this is quite a dilemma for which I haven't found a satisfactory resolution. I could simply not make the huge-string version available to BBC BASIC programmers at all, but that would seem rather unfair. Or I could upgrade BB4W and document the program modifications necessary (it's happened before, after all, when the strings were changed from 255 to 65535 characters). Or I could somehow make both versions available, but that greatly complicates version control, maintenance, automatic upgrades etc. There's no perfect answer!

Sunday, September 11, 2011

"If you can't beat them, join them" is what they say, and that's exactly what I've done in respect of Liberty BASIC! In an earlier blog post I compared BBC BASIC for Windows and Liberty BASIC, and explained my reasoning for believing BBC BASIC to be the better of the two. But that's not to say I can't see the attraction of LB to some people, particularly in having support for Windows GUI features as a standard part of the language rather than being provided in a library.

I've argued before, and still very much believe, that building in a rich set of GUI features as standard can result in a bloated language. This is especially true in the case of an interpreted language when every 'compiled' executable needs to include support for all those features, whether the user's program needs them or not. I've always suspected that this explains, at least in part, the huge size of Liberty BASIC executables (not far short of 3 Megabytes, however small the program).

But whilst the principle of providing GUI support by means of libraries (which is what BBC BASIC does) is sound, in practice there hasn't actually been support for the full range of facilities provided by Liberty BASIC, at least not in a straightforward way. Certainly there are BB4W libraries for the main Windows GUI elements, but LB goes somewhat further, for example you can change the colours of standard Windows controls (whether that's a good idea is arguable) and you can arrange that buttons are automatically relocated when a window is resized.

So I've recently remedied this by writing a BBC BASIC for Windows library LBLIB which makes available the full range of LB features to a BBC BASIC program. Indeed, it does so as far as possible in a way that is fully compatible with Liberty BASIC, and this has made possible a brand-new product: Liberty BASIC Booster!

But I'm getting ahead of myself in the story. Some while ago I wrote an automated translator from QBASIC (or Quick Basic) to BBC BASIC - QB2BBC. Unlike many translators this attempts to perform a 'perfect' translation that requires no user tweaking to achieve identical functionality to the original. The down-side of producing such an accurate translation is that the code it generates is often quite unlike what a human would write, and it doesn't give a good indication of what well-written BBC BASIC code looks like.

At the heart of QB2BBC is a parsing engine which breaks down the QBASIC program to its fundamental elements and then constructs the BBC BASIC program from them. It occurred to me some time ago that this parsing engine could be adapted quite easily to translate other languages into BBC BASIC, one obvious candidate being Liberty BASIC. So earlier this year I started to look quite seriously at the possibility of writing a Liberty BASIC to BBC BASIC translator (tentatively to be called LB2BBC).

But of course there was a big problem. With QBASIC originating from the 1980s, in the era of CP/M and MS-DOS, pretty much every feature it provides has a direct or near equivalent in BBC BASIC. Whilst not always straightforward, the task of QB2BBC was primarily one of translation rather than emulation (the only major exception being QBASIC's music capabilities, which are emulated at run-time by the QBLIB library). With Liberty BASIC however, translating the language syntax to BBC BASIC was only a small part of the challenge; it left the GUI facilties (principally windows, controls, graphics and sprites) completely unsupported.

This is where we complete the circle. By combining the translation capabilities of LB2BBC with the emulation capabilities of LBLIB it becomes possible to run most Liberty BASIC programs, without actually needing Liberty BASIC itself! In consultation with some leading Liberty BASIC advocates it became clear that there might be a market for this, not as as translator to BBC BASIC but as a 'Liberty BASIC Booster'. To the user it would appear as a way to speed up LB programs (up to ten times in some cases) and to create compact single-file executables, without it being obviously apparant that BBC BASIC was at work 'under the hood'.

The compatibility of LBB with LB isn't perfect. As I noted at the end of my article comparing the two BASICs there are two areas in which LB provides a capabilility that BBC BASIC doesn't: support for huge strings and arbitrary-precision integer arithmetic. Therefore programs relying on either of these features cannot run under LBB.

Some may find it superficially strange that I should be supporting Liberty BASIC in this way. One reason is that whilst LBB doesn't directly benefit BBC BASIC, neither does it significantly damage it: I wouldn't expect that any users of LBB might have been persuaded to convert to BB4W instead. Secondly, LBB demonstrates the amazing power and flexibility of BBC BASIC for Windows; with how many other high-level programming languages could it have been achieved? But mostly it's been a fascinating technical challenge, pushing my programming skills to the limit.

Thursday, July 22, 2010

On the same topic as last time, there's another good example of how assembling code at run-time can provide a worthwhile advantage. The x86 processor's division instructions are particularly slow, even on the newest CPUs, and it's common for modern compilers to convert division by a constant to a multiplication followed by a shift. If the multiplicand (which is effectively the reciprocal of the divisor, multiplied by a power of two) and the shift count are chosen carefully - and in some cases a final adjustment made - the result is identical to the division instruction but the speed is much faster.

The rub, of course, is that this technique is normally only applicable to division by a constant which is known at compile time. If you don't know what value you want to divide by until run-time you've got to use a conventional division instruction (strictly speaking if you know the divisor is one of a set of predetermined values you could select from a matching set of routines, or look up the multiplicand and shift count from a pair of tables, but that could eat up most if not all the speed advantage).

There's no such limitation in the case of BBC BASIC. Because the code is assembled at run-time the multiplication-plus-shift trick can be used for any divisor, so long as it remains unchanged for that particular run. For example you might want to divide by the screen width, a value which is constant at run-time but usually not known at compile time. The application would need to contain code for calculating the multiplicand and shift count from the divisor, but that is used only once during the initialisation phase of the program (in practice one would probably use an 'assembler macro' to do it).

Wednesday, December 30, 2009

BBC BASIC allows you to incorporate assembly language code in your BASIC program, typically for use when interpreted BASIC isn't fast enough or for specialised purposes such as coding interrupt service routines. Using a hybrid of BASIC and assembler code is a powerful technique, allowing you to use BASIC for the bulk of the program (typically including the user interface) and assembler for the speed-critical parts (which are likely to be a relatively small proportion of the whole).

Needless to say the assembler provided in BBC BASIC is specific to the particular processor on which it runs. So the original BBC Micro BASIC incorporates a 6502 assembler, Acorn's BASIC for RISC OS machines incorporates an ARM assembler, BBC BASIC (Z80) (e.g. for CP/M or on the Z88) incorporates a Z80 assembler and BBC BASIC (86) for MS-DOS incorporates a 16-bit x86 assembler. BBC BASIC for Windows incorporates a full 32-bit x86 (IA-32) assembler, including support for floating point and MMX instructions.

Whilst support for 'inline' assembly language code is not unusual in programming languages (for example it's a standard feature of C), BBC BASIC provides it in a unique way. Conventionally, any assembler code included in a program is converted to machine code (i.e. assembled) at 'compile' time and incorporated in the final executable as binary data. In contrast, BBC BASIC incorporates the assembler 'source' code in the executable (usually in a compressed format) and assembles it to machine code when the program is run.

Let that sink in for a moment. If the source code is assembled at run-time, it means the BBC BASIC executable must itself contain an assembler - which indeed it does (albeit a small and very fast one). As far as I know no other programming language creates executables which contain an assembler, let alone a full IA-32 assembler.

In many cases this very different way of dealing with embedded assembler code won't have any practical consequences - the end result will be equivalent to the more conventional method. However it makes possible some things which would otherwise be very difficult or messy to achieve. In particular it allows the assembled machine code to vary according to some condition which is determined at run-time.

Why might that be valuable? Suppose, for example, that the assembler code accesses one or more memory locations, which means that the addresses of those locations must be known. Typically, those addresses will vary from one run of the program to the next, or between one PC and another, so they aren't known when the executable is created (i.e. when the program is 'compiled'). Conventionally this means that the addresses can't be incorporated as 'constants' within the assembler code, instead they must be passed to the code as parameters (e.g. on the stack) when it is run.

However in the case of BBC BASIC things are very different. The assembler source code is converted to machine code at run time, so it can usually be arranged that the addresses are known by then. As a result, the 'variable' memory addresses (which change from one run to the next) can be incorporated in the assembled machine code as 'constants', making the code both simpler and faster.

To take a practical example, suppose the assembler code needs to load the eax register from a memory address that is known only at run-time. In a conventional programming language the assembler source code will look something like this (where the address is passed to the assembler code on the stack):

mov ebp,esp ; get pointer to stack frame
mov ebx,[ebp+offset] ; get address
mov eax,[ebx] ; load eax from memory

But in the case of BBC BASIC the code can simply be as follows, because the address is already known:

mov eax,[address] ; load eax from memory

If there are many such memory references the impact on execution speed can be considerable. Similar considerations may apply whenever the code can be simplified by incorporating values as 'immediate constants' rather than 'variables'.

There's another valuable benefit of BBC BASIC assembling source code at run-time. In Microsoft Windows different processes run in separate (virtual) address spaces and are effectively quite well isolated from each other (one process cannot see another process's memory directly). However it is possible for one process to 'inject' code into another process and then run it there. The snag is that the address in the remote process at which the code must be loaded, and executed, isn't known until run-time.

This poses quite a problem for a conventional language, as IA-32 machine code isn't (in general) 'position independent' and so must be assembled (and linked, if appropriate) to run at a specific address. Typically the assembler code has to be written in such a way that it can be 'relocated' at run time - by 'patching' it using, for example, C code - according to the address at which it will eventually be loaded in the remote process. This is potentially very messy, prone to error and difficult to debug.

With BBC BASIC this is all very much easier. Because, by the time the code is assembled, the address at which it must eventually run is known, there is no requirement for it to be 'patched' or 'relocated' (the BBC BASIC assembler has a specific feature whereby the code can be assembled so that it will run at an address different from that at which it is initially stored). This makes it exceptionally well-suited to injecting code into another process.

Saturday, November 08, 2008

There's little doubt that the main competitor to 'BBC BASIC for Windows' is Carl Gundel's 'Liberty Basic'. Superficially the two products have a lot in common:

* They are both 'Windows only'.
* Their IDEs have very similar interfaces and capabilities.
* Both provide direct access to the Windows API and other DLLs.
* Both are interpreters (neither compiles to native code), in particular this allows them to support an EVAL function.
* Both use a 'run-time engine' to execute a tokenised version of the user's BASIC program.
* Both allow BASIC programs either to be executed directly from the IDE, or to be converted to a self-contained executable.

I have even been described in one blog as "a sort of British Carl Gundel"!

However when one looks under the surface there are in fact major differences between the two BASICs. Perhaps the most important is their contrasting approaches to providing capabilities beyond those of 'standard' (i.e. traditional) BASICs. Such capabilities include support for Windows GUI components (menus, dialogue boxes and so on), sorting of data arrays, sprite commands etc.

Liberty Basic builds these capabilities into the language itself. It therefore needs significantly more keywords than, and is more complex than, BBC BASIC. By contrast 'BBC BASIC for Windows' is a 'lean and mean' language: the interpreter provides only the core set of BASIC language features. Support for capabilities such as those mentioned above is provided by means of libraries (surprisingly, Liberty Basic doesn't support code libraries at all).

In this respect you can say that BBC BASIC has a lot in common with the C language, which also relies greatly on libraries (there are other ways in which BBC BASIC has similarities to C, which I will touch on later).

Perhaps the most striking consequence of this fundamental difference between the BASICs is in the size of their 'compiled' executables. The 'BBC BASIC for Windows' run-time engine is tiny (less than 64 kBytes) and its standalone executables - consisting of a single EXE file - are often less than 100 kBytes in size. Liberty Basic's executables, on the other hand, are never smaller than about 2.7 Mbytes (uncompressed) and consist of around ten separate files.

Carl Gundel, perhaps not surprisingly, defends the Liberty Basic executables on the grounds that modern PCs have so much memory and so much disk space that even 3 Mbytes isn't "large". However, compact executables really can have significant benefits, even on such systems.

For a start a small interpreter will tend to run more quickly, because it makes more efficient use of the CPU's instruction cache. Indeed BB4W's interpreter is so small that it will fit in its entirety within the instruction cache of some modern processors (e.g. the 64K cache of the AMD Athlon 64). This contributes to the marked speed difference between BBC BASIC and Liberty Basic (using a Sieve of Eratosthenes benchmark 'BBC BASIC for Windows' took 1.1 seconds and Liberty Basic took 16.6 seconds: fifteen times slower!).

The other big advantage of compact executables is speed of download via an internet connection. BBC BASIC's small, single-file EXEs can be placed on a web site and, with a couple of clicks of the mouse, can be running on the user's PC within a fraction of a second. There's no need to download them (manually) to the PC, nor to decide where to store them; Windows does it all behind the scenes.

I mentioned earlier that BBC BASIC has other similarities to C. These include its ability to access memory directly, at an arbitrary address, and the incorporation of an 'inline' assembler. Of course these are 'advanced' features, and most users will neither need nor want to utilise them, but in the hands of an expert they make it possible for BBC BASIC for Windows to do things that Liberty Basic could never do on its own.

An example is accessing 'object-based' APIs such as Direct3D, COM automation (ActiveX) etc. BBC BASIC can manage this quite easily, but Liberty Basic needs to rely on external DLLs (typically written in C or Visual Basic) to achieve it. Even things as simple as accessing an API function exported by ordinal rather than name are beyond Liberty Basic's capabilities.

Although BBC BASIC for Windows is much faster than Liberty Basic, there can still be occasions when it is not fast enough. This is where the built-in assembler can be invaluable; programs can be fairly easily written in which the few time-critical parts are written in assembler and the rest in BASIC, a hybrid approach which is extremely powerful and flexible.

To be fair, there are a couple of respects in which BBC BASIC is less capable than Liberty Basic: these are the maximum sizes of integer and string variables. In common with most modern languages BBC BASIC uses 32-bit integers whereas Liberty Basic supports 'arbitrary precision' integer arithmetic. BBC BASIC's strings are limited to 65535 characters, whereas Liberty Basic's strings can be much longer. In those very rare circumstances when these are important there are workarounds for both issues in BBC BASIC.

In conclusion I am quite convinced that BBC BASIC for Windows is a better language than Liberty Basic. It creates smaller, faster executables. It allows you to do far more things without recourse to using an external DLL. BBC BASIC is a real enthusiasts' language; whatever you want to do it will rise to the challenge.

Wednesday, September 26, 2007

I've recently discovered something fascinating about the inner workings of Windows which, whilst known by the cognoscenti, seems not at all well understood and even less well documented.

It is well known that messages posted to a window, e.g. by calling PostMessage, are queued and only get dispatched to the Window Procedure by means of a message loop (or message pump) calling the GetMessage (or PeekMessage with the PM_REMOVE option) and DispatchMessage functions. It is also quite well known that messages sent to a window from the same thread, e.g. by calling SendMessage, are not queued and call the Window Procedure directly. Note that this can easily result in the Window Procedure being called re-entrantly, for example when a message handler itself calls SendMessage. It also means that sent messages are never filtered by the GetMessage wMsgFilterMin and wMsgFilterMax parameters.

But what happens to messages sent to a window from another thread? They neither get queued (in the sense of the posted messages) nor do they call the Window Procedure directly, but instead they are dispatched to the Window Procedure within a GetMessage, PeekMessage or SendMessage function. This dispatch happens invisibly and automatically within the functions; GetMessage and PeekMessage will never return sent messages, but they nevertheless dispatch them!

This knowledge is invaluable for preventing deadlocks. Suppose Thread A receives a message, and within the message handler it waits on some event happening in Thread B. Suppose also that Thread B sends a message to Thread A and does not action the event until the SendMessage function returns. This is a recipe for a deadlock: Thread A is waiting for Thread B's event and Thread B is waiting for Thread A to respond to its message, which it cannot do because it is waiting for Thread B....

With our new-found knowledge the solution is simple. Whilst Thread A is waiting for Thread B's event, it calls PeekMessage. It needn't remove any messages from the queue, nor take any notice of what is returned from PeekMessage, simply calling the function is sufficient! Now the message sent from Thread B is dispatched (within PeekMessage), Thread A responds, and Thread B continues execution and actions the event. Result: no deadlock!