Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
人はどうしてコンピュータを、人を制する魔法の箱として考えたいのですか?
Why do we want so much to control others when we won't control ourselves?
どうしてそれほど、自分を制しないのに、人をコントロールしたいのですか?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.
コンピュータの記憶というものはただ改良した紙ですし、CPU 何て特長ある筆に特殊の消しゴムがついたものにすぎないし、ネットワークそのものは裏庭の塀が少し拡大されたものぐらいです。

(original post/元の投稿 -- defining computers site/コンピュータを定義しようのサイト)

Monday, June 12, 2017

Reinventing computers.

I mention my bad habits a bit, but I don't really go into much detail:
One of these days I'll get someone to pay me to design a language that combines the best of Forth and C.
Then I'll be able to leap wide instruction sets with a single #ifdef, run faster than a speeding infinite loop with a #define,
and stop all integer size bugs with my bare cast.
I recall trying to get a start ages ago, on character encoding, CPUs, and programming languages, and, more recently, more on character encoding.

These are areas in which I think we have gone seriously south with our current technology.

First and foremost, we tend to view computers too much as push-button magic boxes and not enough as tools.

Early PCs came with a bit of programmability in them, such as the early ROMmed BASIC languages, and, more extensively, toolsets like the downloadable Macintosh Programmers' Workbench. Office computers also often came with the ability to be programmed. Unix (and Unix-like) minicomputers and workstations generally came with, at minimum, a decent C compiler and several desktop calculator programs.

Modern computers really don't provide such tools any more. It's not that they are not available, it's that they are presented as task-specific tools, and you often have to pay extra for them. And they are not nearly as flexible (MSExcel macros?).

Computers were not given to us to use as crutches. They were given to us to help us communicate and to help us think.

I'm not alone in my interest in retro-computing, but I think I have a little bit unusual ultimate goal in my interests.

I want to go back and re-open certain paths of exploration that the industry has lopped off as being too unprofitable (or, really, too profitable for someone else).

One is character encoding. Unicode is too complicated. Complicated is great for big companies who want to offer a product that everyone must buy. The more complicated they can make things, the harder it is for ordinary customers to find alternatives. And that is especially true if they can use patents and copyrights on the artificial complexities that they invent, to scare the customer away from trying to solve his or her own problems -- or their own corporate problems, in the case of the corporate customer.

Computer are supposed to help us solve our own problems, not to impose our own solutions on unsuspecting other people, while making them pay for the solutions that really don't solve their problems.

Now, producing something simpler than Unicode is going to be hard work, harder even than putting the original Unicode together was.

Incidentally, for all that I seem to be disparaging Unicode, the Unicode Consortium has done an admirable job, and Unicode is quite useful. They just made a conscious decision to try not to induce changes on the languages they are encoding. It's a worthy and impossible goal.

And they should keep it up. Even though it's an impossible goal, their pursuing that goal is enabling us to communicate in ways we couldn't before.

But we must begin to take the next step.

Rehashing,
  • The encoding needs to include the ability to encode a single common set of international characters/glyphs in addition to all the national encodings.
  • It needs to include
    • characters,
    • non-character numerics,
    • bitmap and vector image,
    • and other arbitrary (binary/blob) data.
  • It needs to be easily parsed with a simple, regular encoding grammar.
  • And it needs to be open-ended, allowing new words and characters to be coined on-the-fly.
Another path involves CPUs. Intel wants us all to believe that they own the pinnacle of CPU design, but, of course, that is just corporate vanity.

In the embedded world, lots of CPUs that the rest of the world has forgotten are still very much in use, because their designs are optimal in specific engineering contexts. Tradition is also influential, but there are real, tangible engineering reasons that certain non-mainstream CPUs are more effective in certain application areas. The complexity and temporal patterns of the input will favor certain interrupt architectures. The format of the data will favor specific register set constructions. Etc.

Many engineers will acknowledge the old Motorola M6809 as the most advanced 8-bit CPU ever, but it seems to have been a dead-end. ("Seems." It is still in use.) "Bits of it lived on in the 68HC12 and 68HC16." But the conventional wisdom is now that, if you need such an advanced CPU, it's cheaper to go with a low-end 32-bit ARM processor.

What got left behind was the use of a split stack.

The stack is where the CPU keeps a record of where it has been as it chases the branching trails of a problem's solution. When the CPU reaches a dead end, the stack provides an organized structure for backtracking and starting back down new branches in the trail.

Even "stackless" run-time environments tend to imitate stacks in the way they operate, because of a principle called problem context, in addition to the principle of backing out of a non-workable solution.

But the stack doesn't just track where the CPU has been. It also keeps the baggage the CPU carries with it, stuff called local (or context-local) variables. Without the data in that baggage, it does no good for the CPU to try to back up. The data is part and parcel of where it has been.

Most "modern" CPUs keep the code location records in the same memory as the context-local data. It seems more efficient, but it also means that a misbehaving program can easily lose track of both the context data and the code location at once. When that happens, there is no information to analyze about what went wrong. The machine ends up in a partially or completely undefined state.

Worse, in a hostile environment, such a partially defined state provides a chance for attacking the machine and the persistent data that it keeps on the hard disk. (Stack crashes are most effective when the state of the program has already become partially undefined.)

[JMR201706301711: I've recently written rather extensively on stack vulnerabilities and using a split stack to reduce the vulnerabilities:
]

Splitting the stack allows for more controlled recovery from error states that haven't been provided for. In the process, it reduces the surface area susceptible to attack.

The split stack also provides a more flexible run-time architecture, which can help engineers reduce errors in the code, which means fewer partially-defined states.

There are a couple of other areas in which so-called modern CPUs in use in desktop computers and portable data devices are not well matched to their target application areas, and the programming languages (and operating systems), reflecting the hardware, are likewise not well-matched. This is especially true of the sort of problems we find ourselves trying to solve, now that we think most of the easy ones have been solved.

In order to flesh out better CPU architectures, I want to build a virtual machine for the old M6809, then add some features like system/user separation, and then design an expanded address space and data register CPU following the same principles.

I'm pretty sure it will end up significantly different from the old 68K family. (The M6809 is often presented as a "little brother" to the 68K, but they were developed separately, with separate design goals. And Motorola management never really seemed to understand what they had in the 6809.)

Once I have an emulator for the new CPU, I want to develop a language that takes advantage of the CPU's features, allowing for a richer but cleaner procedural API that becomes less of a data and time bottleneck. It should also allow for a more controlled approach to multiprocessing.

And then I want to build a new operating system on what this language and CPU combination would allow, one which would allow the user to be in control of his tools, instead of the other way around.

This is what I mean when I say I am trying to re-invent the industry.

No comments:

Post a Comment