Misunderstanding Computers

Why do we insist on seeing the computer as a magic box for controlling other people?
Why do we want so much to control others when we won't control ourselves?

Computer memory is just fancy paper, CPUs are just fancy pens with fancy erasers, and the network is just a fancy backyard fence.

(original post -- defining computers site)

Friday, April 21, 2017

The Problem with Full Unicode Domain Names -- apple.com vs. appІe.com

Well, this one is taking longer to boil over than I expected. I've been watching for the storm for over fifteen years, and convoluted fixes on fixes have dodged the bullet this long.

[JMR201704221101: addendum]

I should note that the primary danger comes from clicking links given to you by untrusted sources. The best solution here is not to do that. Abstain. Don't click on the links.

Copy them out, look at them in a text editor using a technical font that shows differences between I, 1, and l, and between 0 and O, etc.

Plug the URL into the search field of a web search engine -- not into the URL bar of your browser, which takes you straight there. Let the search engine tell you what it knows about the site before you go there.

Then type in the domain name part by hand. If you have the URL

http://shop.apple.com/login/username=pinkfloy&mode=longstringofstuff
the domain name part is
shop.apple.com 
(There's more that can be said, but I don't want to confuse you about controlling domains, so just type the whole domain name.)

If that's too much trouble, maybe you didn't want to go there anyway. But at least click on something the search engine shows you instead of the link in the e-mail.

[JMR201704221101: end addendum]

The problem:


Depending on your default fonts, you may be able to see a difference between the following two domain names:

apple.com vs. appІe.com
It's similar to the problem with
apple.com vs. appIe.com
but with a twist. The first one uses a Cyrillic (as in Russian) character to potentially cause confusion, where the second one keeps the trickiness all in the Latin (as in English) alphabet.

Let's look at both of those again, and I'll try to specify a font where there will be problems. First, we'll try the Arial font (if it's on your computer):
apple.com vs. appІe.com
(Latin little "l" -- Cyrillic capital "І")
and next the Courier font (if it's on your computer):
apple.com vs. appІe.com
(Latin little "l" -- Cyrillic capital "І")
And we'll look at the Latin-only domain names, first in Arial:
apple.com vs. appIe.com
(Latin little "l" -- Latin capital "I")
and then in Courier:
apple.com vs. appIe.com
(Latin little "l" -- Latin capital "I")

Do you see what's happening?

Someone could grab the domain with the visual spoof and trick you into giving them your Apple login and password and maybe even your credit card number.

When domain names were all lower case Latin, we had fewer problems. In other words,
appIe.com 
was properly spelled
appie.com
and the browser would display it in the latter form.


Well, there was still the problem with
app1e.com
substituting the number "1" for the little "l". But the registrars tended to try to help by refusing to register confusing domain names. And browsers were careful to use fonts that would show the differences in the URL bar.

Some time ago, pretty much all Unicode language scripts became allowed in domain names. This was strongly pushed by China, where they did not want {sarcasm-alert} to have all their loyal subjects surfing the Internet in Latin. That would let everyone see how superior English is, and that would never do.{end sarcasm-alert.}

(I shouldn't be sarcastic. They do need Chinese URLs. Otherwise, there would be too many companies competing for bai.com and ma.com.)

Apparently, non-Latin scripts are even allowed to use capitals. Or, at least, unscrupulous or careless registrars seem to be allowing them in some cases. I'm not sure why.

(Here's the RFC. What am I missing?)

If the Cyrillic visual spoof I am using as an example were coerced to lower case in the URL bar, here's what it would look like in the Arial font:
apple.com vs. appіe.com
(Latin little "l" -- Cyrillic lower case "і")
That would solve a lot of problems.

If you are worried about this, one thing that can help, if you are using Firefox, is to type
about:config
in the URL bar. (That's where URLs like
https://www.lds.org
show up, and you can type them in by hand to go there.)

You'll get a warning that tells you that the Mozilla Foundation is not going to take the blame if you use non-default settings. (They won't anyway, but don't check the box that says you don't want to be warned. And remember that you have done this.)

Use the search bar to search for
punycode
and you'll find
network.IDN_show_punycode;false
Double-click the "false" and it will turn to "true". And then URLs like
www.appІe.com
will be displayed in the status bar as URLs like
www.xn--80ak6aa92e.com/
Now, that's ugly, don't you think? Anyway, you won't be mistaking it for
www.apple.com
(This is called punycode. Hmm. Actually, the Japanese page on punycode shows what's happening a little better than the English page.)

Then again, you will be wondering what that URL means. So I don't really know if I want to recommend it.

If I were a Mozilla developer involved with this, I would take a clue from what I've done above and do it like this:
www.apple.com (all Latin)
but
www.appІe.com (Cyrillic "І")
In other words, all the characters in URLs from languages other than the browser's default language would be displayed with colored backgrounds to make them stand out. And I might even add a warning bubble or something that said,
Warning! Mixed language URL contains Cyrillic "І"!
floating over the URL. This approach would mitigate a lot, including
  • Іds.org (Cyrillic)
  • аррӏе.com (Cyrillic)
  • perl.org (zenkaku, or full-width)
and so forth.

(I thought this was in the RFCs, but I'm not seeing it. Maybe I'm remembering my own thoughts on how to mitigate this particular semantic attack.)
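The check itself is not hard, for what it's worth. Here's a minimal sketch in C of the mixed-script test such a warning could hang off of, assuming well-formed UTF-8 input; the function names are mine, not from any browser's source, and real code would consult the full Unicode script tables instead of lumping everything non-ASCII together:

    #include <stdio.h>
    #include <stdint.h>

    /* Decode one (assumed well-formed) UTF-8 sequence and advance. */
    static uint32_t next_cp(const unsigned char **s)
    {
        const unsigned char *p = *s;
        uint32_t cp;
        int extra;

        if (p[0] < 0x80)      { cp = p[0];        extra = 0; }
        else if (p[0] < 0xE0) { cp = p[0] & 0x1F; extra = 1; }
        else if (p[0] < 0xF0) { cp = p[0] & 0x0F; extra = 2; }
        else                  { cp = p[0] & 0x07; extra = 3; }
        for (int i = 1; i <= extra; i++)
            cp = (cp << 6) | (p[i] & 0x3F);
        *s = p + 1 + extra;
        return cp;
    }

    /* Flag a host name that mixes ASCII with anything else. */
    static int mixed_script(const char *host)
    {
        const unsigned char *p = (const unsigned char *)host;
        int ascii = 0, non_ascii = 0;

        while (*p) {
            uint32_t cp = next_cp(&p);
            if (cp < 0x80)
                ascii = 1;      /* plain Latin letters, digits, dots */
            else
                non_ascii = 1;  /* e.g. 0x0400-0x04FF is Cyrillic */
        }
        return ascii && non_ascii;
    }

    int main(void)
    {
        /* "appІe.com" with Cyrillic capital I (U+0406), in UTF-8 */
        const char *spoof = "app\xD0\x86" "e.com";

        puts(mixed_script(spoof) ? "Warning! Mixed language URL!"
                                 : "single script");
        return 0;
    }

Since the dots and the "com" are always ASCII, even an all-Cyrillic label like аррӏе.com trips the test, which is what we want here.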

I have advocated improving Unicode by reconstructing the encoding and including an international character set where such visual doublings could be eliminated. And separating Chinese and Japanese language encodings, and the three different Chinese encodings from each other, as well.

Nobody seems to like the idea.

It's a lot of work.

I'd be willing to do it relatively cheap! (Relatively.)

Model Boot-up Process Description, with Some References to Logging

This is a description of a model boot-up process for a device that contains a CPU, with some references to logging.

(This is a low-level follow-up to previous posts, which may provide more useful information.)

This is just a rough model, a rough ideal, not a specification. Real devices will tend to vary from this model. It's just presented as a framework for discussion, and possibly as a model to refer to when documenting real hardware.



(1) Simple ALU/CPU test.

The first thing the CPU should do on restart is check the Arithmetic-Logic Unit, not in the grand sense, but in a limited sense.

Something like (assuming an 8-bit binary ALU) adding 165 to 90 and checking that the result comes out 255 (A5₁₆ + 5A₁₆ == FF₁₆), and then adding 1 to the result to see if the result is 0 with a carry, would be a good, quick check. This would be roughly equivalent to trying to remember what day it is when you wake up, then checking to see that you remember what the day before and the day after are.

It doesn't tell you much, but it at least tells you that your brains are trying to work.

* If the ALU appears to give the wrong result, there likely won't be much that can be done -- maybe set a diagnostic flag and halt safely.

* In some devices, halting itself is not safe, and an alternative to simply halting such as having the device securely self-destruct may be safer. Halting safely may have non-obvious meanings.

Now, it's very likely that this test can be made a part of the next step, but we need to be conscious of it.
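To make step (1) concrete, here is a sketch of the check in C, purely for readability -- on real hardware this would be a handful of machine instructions run before RAM is even touched, and alu_selftest() is a name I just made up:

    #include <stdint.h>

    /* Step (1) sanity check: A5 + 5A == FF, then FF + 1 wraps
       to 00 with a carry out of bit 7. */
    static int alu_selftest(void)
    {
        volatile uint8_t a = 0xA5, b = 0x5A;
        uint8_t sum = (uint8_t)(a + b);

        if (sum != 0xFF)
            return 0;                       /* the add came out wrong */

        uint16_t wide = (uint16_t)sum + 1;  /* widen to catch the carry */
        if ((uint8_t)wide != 0 || (wide >> 8) != 1)
            return 0;                       /* no wrap, or no carry */

        return 1;                           /* brains appear to work */
    }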

(2) Initial boot ROM test.

There should be an initial boot ROM that brings the CPU up. The size should be in the range of 1,000 instructions to 32,768 instructions.

Ideally, I would strongly suggest that it contain a bare-metal Forth interpreter as a debugger/monitor, but it may contain some other kind of debug/monitor. It may just contain a collection of simple Basic Input-Output library functions, but I personally do not recommend that. It needs to have some ability to interact with a technician.

And, of course, it contains the machine instructions to carry out the first several steps of the boot-up process.

This second step would then be to perform a simple, non-cryptographic checksum of the initial boot ROM.

Which means that the ROM contains its own test routines. This is clearly an example of chicken-and-egg logical circularity. It is therefore not very meaningful.

This is not the time for cryptographic checksums.
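Something on the order of a Fletcher-16 sum is plenty for this. A minimal sketch, where the ROM base address, its size, and the location of the stored sum are all hypothetical placeholders:

    #include <stdint.h>

    #define ROM_BASE ((const uint8_t *)0xF000u)  /* hypothetical */
    #define ROM_SIZE 0x0FFEu   /* last 2 bytes hold the stored sum */

    /* Fletcher-16 over the initial boot ROM image. */
    static uint16_t rom_checksum(void)
    {
        uint16_t lo = 0, hi = 0;

        for (uint32_t i = 0; i < ROM_SIZE; i++) {
            lo = (uint16_t)((lo + ROM_BASE[i]) % 255u);
            hi = (uint16_t)((hi + lo) % 255u);
        }
        return (uint16_t)((hi << 8) | lo);
    }

The caller compares the result against the stored sum; on mismatch, set the diagnostic flag and halt safely, as above.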

* Success does not mean that the CPU is secure or safe. Failure, on the other hand, gives us another opportunity to set a diagnostic flag for a technician to find, and halt safely, whatever halting safely means.

On modern, highly integrated CPUs, this ROM is a part of the physical CPU package. It should not be re-programmable in the field.

(That's one reason it should be small -- making it small helps reduce the chance for serious bugs that can't be fixed. This smallest part of the boot process cannot be safely re-written and cannot safely be allowed to be overridden.)

For all that it should not be re-programmable in the field, the source should be available to the end-administrator, and there should be some means of verifying that the executable object in the initial boot ROM matches the source that the vendor says should be there.

(3) Internal RAM check.

Most modern CPUs will have some package internal RAM, distinct from CPU registers. It is a good idea to check these RAM locations at this point, to see that what is written out can be read back, using bit patterns that can catch short and open circuits in the RAM.

Just enough RAM should be tested to see that the initial boot-up ROM routines can run safely. If the debug/monitor is a Forth interpreter, it should have enough return stack for at least 8 levels of nested call, 16 native integers on the parameter stack, and 8 native integers of per-user variable space. That's 32 cells of RAM, or room for 32 full address words, in non-Forth terminology.

(I'm speaking roughly, more complex integrated packages will need more than that, much more in some cases. Very simple devices might actually need only half that. The engineers should be able to determine actual numbers from their spec. If they can't, they should raise a management diagnostic flag and put the project in a wait state.)

* Again, if there are errors, there is not much the device can do but set a diagnostic flag and do its best to halt safely, whatever halting safely means.
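Here is a sketch of the kind of pattern test I mean -- walking ones to catch stuck or shorted data bits, plus an address-in-address pass to catch address line faults. The base address and cell count are hypothetical; the part's spec supplies the real ones:

    #include <stdint.h>

    static int ram_selftest(volatile uint8_t *base, uint32_t cells)
    {
        for (uint32_t i = 0; i < cells; i++)
            for (uint8_t bit = 0; bit < 8; bit++) {  /* walking ones */
                base[i] = (uint8_t)(1u << bit);
                if (base[i] != (uint8_t)(1u << bit))
                    return 0;
            }

        for (uint32_t i = 0; i < cells; i++)         /* address test */
            base[i] = (uint8_t)i;
        for (uint32_t i = 0; i < cells; i++)
            if (base[i] != (uint8_t)i)
                return 0;

        return 1;
    }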

(4) Lowest level diagnostic firmware.

At this point, we can be moderately confident that the debug/monitor can safely be entered, so it should be entered and initialize itself.

The next several steps should run under the control of the debug/monitor.

* Again, if the debug/monitor fails to come up in a stable state, the device should set a diagnostic flag and halt itself as safely as possible.

** This means that the debug/monitor needs a resident watchdog cycle that will operate at this level.

(5) First test/diagnostic device.

We want a low-level serial I/O (port) device of high reliability, through which the technician can read error messages and interact with the debug/monitor.

(Parallel port could work, but it would usually be a waste of I/O pins for no real gain.)

* This is the last point where we want to just set a diagnostic flag and halt as safely as possible on error. Any dangerous side-effects of having started the debug port should be addressed before halting safely at this stage.
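For illustration, a polled transmit routine for such a diagnostic port might look like this sketch. The register addresses and the ready bit are hypothetical placeholders for whatever the actual UART's datasheet says:

    #include <stdint.h>

    #define UART_STATUS (*(volatile uint8_t *)0x8000u)  /* hypothetical */
    #define UART_DATA   (*(volatile uint8_t *)0x8001u)  /* hypothetical */
    #define TX_READY    0x01u

    static void diag_putc(char c)
    {
        while (!(UART_STATUS & TX_READY))
            ;                   /* spin until transmitter is free */
        UART_DATA = (uint8_t)c;
    }

    static void diag_puts(const char *s)
    {
        while (*s)
            diag_putc(*s++);
    }

Polling is fine here; interrupts aren't trustworthy this early in the boot.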

(6) Full test of CPU internal devices.

This step can be performed somewhat in parallel with the next step. Details are determined by the internal devices and the interface devices. Conceptually, however, this is a separate step.

All internal registers should be tested to the extent that it is safe to test them without starting external devices. This includes being able to write and read any segment base and limit/extent registers, but does not actually include testing their functionality.

If the CPU provides automatic testing, this is probably the stage where it should be performed (which may require suspending or shutting down, then restarting the monitor/debug processes).

Watchdog timers should be checked to the extent possible and started during this step.

If there is internal low-level ROM that remains to be tested, or if management requires cryptographic checksum checks on the initial boot ROM, this is the stage to do those.

Note that the keys used here are not, repeat, not the manufacturer's update keys. Those are separate.

However, for all that management might require cryptographic self-checks at this stage, engineers should consider such checks to be exercising the CPU and looking for broken hardware, and not related to security. There should be a manufacturer's boot key, and the checksums should be performed with the manufacturer's boot key, since the initial boot ROM is the manufacturer's code.

How to hide the manufacturer's boot key should be specified in the design, but, if the test port enabled in step (5) allows technician input at this step, such efforts to hide the manufacturer's key can't really prevent attack, only discourage attack.

Even if the device has a proper system/user separation, the device is in system state right now, and the key has to be readable to be used.

The key could be encrypted and hidden, spread out in odd corners of the ROM. There could be two routines to read it, and the one generally accessible through the test port could be protected by security switch/strap and/or extra password. But the supervisor, by definition, allows the contents of ROM to be read and dumped through the test port at this stage. A determined engineer would be able to analyze the code and find the internal routine, and jump to it. Therefore, this raises the bar, but does not prevent access.

Another approach to raising the bar is the provision of a boundary between system/supervisor mode and key-access mode. The supervisor could use hardware to protect the key except when in key-access mode, and could use software to shut down the test port when key-access mode is entered. This would make it much more difficult to get access to supervisor commands while the key is readable, but there are probably going to be errors in the construction that allow some windows of opportunity. It is not guaranteed that every design will be able to close off all windows of opportunity.

Such efforts to protect the boot key may be useful. They do raise the bar. But they do not really protect the boot key, only discourage access.

And legal proscriptions such as that epitome of legal irony called DMCA do not prevent people who ignore the law from getting over the bar.

Thus, the key used to checksum the initial boot ROM must not be assumed to be unknown to attackers. (And, really, we don't need to assume it is unknown, if we don't believe in fairy tales about protecting intellectual property at a distance. As long as this initial boot ROM can't be re-written. As long as the update keys are separate.)


The extra ROM, if it exists, should not be loaded yet, only tested.

If extra RAM is required to do the checksums, the RAM should be checked first, enough to perform the checksums.

All remaining internal RAM should be checked at this stage.

(7) Low-level I/O subsystems.

Finally, the CPU package is ready to check its own fundamental address decode, data and address buffers, and so forth. Not regular I/O devices, but the devices that give it access to low-level flash ROM, cache, working RAM, and the I/O space, in that order.

They should be powered up and given rudimentary tests.

Note that the flash ROM, cache, working RAM, and I/O devices themselves should not yet be powered up, much less tested.

Only the interfaces are powered up and tested at this step, and they must be powered up in a state that keeps the devices on the other side powered down.

* On errors here, any devices enabled to the point of error should be powered down in whatever order is safe (often in reverse order of power-up), diagnostic messages should be sent through the diagnostic port, and the device should set a diagnostic flag and enter as safe a wait state as possible.

** It may be desirable to enter a loop that repeats the diagnostic messages.

It would seem to be desirable to provide some way for a technician to interrogate the device for active diagnostic messages.

** But security will usually demand that input on the diagnostic port be shut down unless a protected hardware switch or strap for this function is set to the enabled position/state. This is one of several such security switch/straps, and the diagnostic message will reflect the straps' state to some extent.

This kind of security switch or strap is not perfect protection, but it is often sufficient, and is usually better than nothing. (Better than nothing if all involved understand it is not perfect, anyway.)

** In some cases, the security switch/straps should not exist at all, and attempts to find or force them should be met with the device's self-destruction. In other cases, lock and key are sufficient. In yet other cases, such as in-home appliance controllers, a screw panel may be sufficient, and the desired level of protection.

Straps are generally preferred to switches, to discourage uninformed users from playing with them.

*** However, attempts to protect the device from access by the device's legal owner or lawfully designated system administrator should always be considered highly suspect, and require a much higher level of engineering safety assurance. If the owner/end-admin user must be prevented from maintenance access, it should be assumed that the device simply cannot be maintained -- thus, quite possibly should self-destruct on failure.

(8) Supervisor, extended ROM, internal parameter store.

The initial boot ROM may actually be the bottom of a larger boot ROM, or there may be a separate boot ROM containing more program functions, such as low-level supervisor functions, to be loaded and used during initial boot up. This additional ROM firmware, if it exists, should be constructed to extend, but not replace the functionality in the initial boot ROM.

This extra initial boot ROM was tested in step (6); it should be possible to begin loading and executing things from it now. It would contain the extensions in stepped modules, starting with modules necessary to support the bootstrap process as it proceeds.

Considering the early (classic) Macintosh, a megabyte of ROM should be able to provide a significant level of GUI interface for the supervisor, giving end-admins with a primarily visual orientation improved ability to handle low-level administration. But since we don't have display output at this point, such functionality should be oriented toward the technician's serial port at this stage.

This supervisor would also contain the basic input/output functionality, so it could be called, really, a true "Basic Input/Output Operating System" -- BIOOS. But that would be confusing, so let's not do that. Let's just call it a supervisor.

It could also contain "advanced" hooks and virtual OS support such as a "hypervisor" has, but we won't give in to the temptation to hype it. It's just a supervisor. And most of it will not be running yet.

This remaining initial boot ROM is not an extension boot ROM such as I describe below, but considered part of the initial boot ROM.

There should be internal persistent store that is separate from the extension boot (flash) ROM, to keep track of boot parameters such as the owner's cryptographic keys and the manufacturer's update cryptographic keys for checksumming the extension flash ROM, passwords, high-level boot device specification, etc. It should all be write protected under normal operation. The part containing the true cryptographic keys for the device and such must be both read- and write-protected under normal operation, preferably requiring a security switch/strap to enable write access.

Techniques for protecting these keys have been partially discussed above. The difference is that these are the owner's keys and update keys, and those are the manufacturer's boot keys.


This parameter store should be tested and brought up at this point.

Details such as how to protect it, how to enable access, and what to do on errors are determined by the engineers' design specification.

In the extreme analysis, physical access to a device means that anything it contains can be read and used. The engineering problem is the question of what kinds of cryptological attacks are expected, and how much effort should be expended to defend the device from unauthorized access.

Sales literature and such should never attempt to hide this fact, only assert the level to which they are attempting to raise the bar.

Again, attempts to protect the device from access by the legitimate owner/end-admin should be considered detrimental to the security of the device.

* At this point, reading the owner's keys and update keys from the test port should be protected by security switch/strap and password. But, again, until the boot process has proceeded far enough to be able to switch between system and user mode, the protections have to be assumed to be imperfect.

Providing a key-access mode such as described above for the manufacturer's key should mitigate the dangers and raise the bar to something reasonable for some applications, but not for all.

Some existing applications really should never be produced and sold as products.

(As an example, consider the "portable digital purse" in many cell phones. That is an abomination. Separated from the cell phone, it might be workable, but only with specially designed integrated packages, and only if the bank always keeps a copy of the real data. Full discussion of that is well beyond the scope of this rant.)

(9) Private cache.

If there is private cache RAM local to the first boot CPU, separate from the internal RAM, it should be tested now. Or it could be scheduled and set to run mostly in a lesser privileged mode after lesser privileged modes are available.

If there are segment base and limit/extent registers, their functionality may be testable against the local cache.

In particular, if the stack register(s) have segment base and limit, and can be pointed into cache, it might be possible to test them and initialize the stacks into such cache here, providing some early stack separation.

If dedicated stack caches are provided in the hardware, they should be tested here. If they can be used in locked mode (no spills, deep enough), the supervisor should switch to them now.

* Errors at this point will be treated similarly to errors in step 7.

(10) Exit low-level boot and enter intermediate level boot process.

At this point, all resources owned by the boot-up CPU should have been tested.

Also, at this point, much of the work can and should be done in less secure modes of operation. The less time spent in system/supervisor mode, the better.

(10.1) Testing other CPUs.

If there are multiple CPUs, this is the step where they should be tested. The approach to testing the CPUs depends on their design, whether they share initial boot ROMs or are under management of the initial boot CPUs, etc.

From a functional point of view, it is useful if the first boot CPU can check the initial boot ROMs of the other CPUs before powering them up, if those ROMs are not shared. It may also be useful for the first boot CPU to initiate internal test routines on the others, and monitor their states as they complete.

At any rate, as much as possible should be done in parallel here, but care should be exercised to avoid one CPU invalidating the results of another.

* Again, errors at this point will be treated similarly to errors in step 7.

(10.2) Testing shared memory management hardware access, if it exists.

While waiting for the other CPUs to come up, any true memory management hardware should be tested and partially initialized.

At this point, only writing and reading registers should be tested, and enough initialization to allow un-mapped access.

* Again, errors at this point will be treated similarly to errors in step 7. MMU is pretty much vital, if it exists.

(10.3) Finding and testing shared RAM.

Shared main RAM should be searched for before shared cache.

As other CPUs come up, they can be allocated to test shared main RAM. (Really, modern designs these days should go to multiple CPUs before going to larger address spaces or faster CPUs.) If there are multiple CPUs, testing RAM should be delegated to CPUs other than the first boot CPU.

This also gets tangled up in testing MMU.

Tests should be run first without address translation, then spot-checked with address translation.

As soon as enough good RAM has been found to support the return address stack and local variable store (one stack in the common case now, but preferably two in the future, plus a thread heap and a process heap), the supervisor OS, to the extent it exists, should be started now if it has not already been started. (See next step.)

Otherwise, parallel checks on RAM should proceed without OS support.

Either way, the boot ROM should support checking RAM in the background as long as the device is operational. RAM which is currently allocated would be left alone, and RAM which is not currently allocated would have test patterns written to it and read back, helping erase data that programs leave behind.

Such concurrent RAM testing would be provided in the supervisor in the initial boot up ROM, but should run in a privilege-reduced state (user mode instead of system/supervisor).

* Usually, errors in RAM can be treated by slowing physical banks down until they work without errors, or by mapping physical banks out. Again, a log of such errors must be kept, and any errors in RAM should initiate a RAM checking process that will continue in the background as long as the device is running.

** If there are too many errors at this point, they may be treated similarly to errors in step 7.

*** Any logs kept in local RAM should be transferred to main RAM once enough main RAM is available (and known good).
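A sketch of what one pass of that background checker might look like, running at reduced privilege. frame_is_free(), frame_addr(), and log_ram_error() are hypothetical hooks into the supervisor, not anybody's real API:

    #include <stdint.h>
    #include <stddef.h>

    #define FRAME_SIZE 4096u                      /* hypothetical */

    extern int      frame_is_free(uint32_t frame);
    extern uint8_t *frame_addr(uint32_t frame);
    extern void     log_ram_error(uint32_t frame);

    static const uint8_t patterns[] = { 0x00, 0xFF, 0xA5, 0x5A };

    void scrub_pass(uint32_t nframes)
    {
        for (uint32_t f = 0; f < nframes; f++) {
            if (!frame_is_free(f))
                continue;            /* never touch allocated RAM */

            uint8_t *p = frame_addr(f);
            for (size_t k = 0; k < sizeof patterns; k++) {
                for (uint32_t i = 0; i < FRAME_SIZE; i++)
                    p[i] = patterns[k];
                for (uint32_t i = 0; i < FRAME_SIZE; i++)
                    if (p[i] != patterns[k]) {
                        log_ram_error(f);  /* map-out happens elsewhere */
                        goto next_frame;
                    }
            }
            /* side effect: data left behind by dead processes is erased */
    next_frame: ;
        }
    }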

(10.4) Testing shared cache.

As other CPUs come up, they can also be allocated to testing shared cache. As with testing main RAM, testing cache should be delegated to CPUs other than the first boot CPU. Also, main RAM comes before cache until there is enough known-good RAM to properly support multiple supervisor processes.

And this also gets tangled up in testing MMU.

Tests should be run first without being assigned to RAM, then again with RAM assigned.

* If there are errors in the cache, it might be okay to disable or partially disable the cache. Engineers must make such decisions.

** Errors at this point may still be treated similarly to errors in step 7, depending on engineering decisions. If it is acceptable to run with limited cache, or without cache, some logging mechanism that details the availability of cache must be set up. Such logging would be temporarily kept in internal RAM.

*** The decision about when to enable cache is something of an engineering decision, but, in many cases, once cache is known to be functional, and main RAM has also been verified, the cache can be put into operation.

In some designs, caches should not be assigned to RAM that is still being tested.

(11) Fully operational supervisor.

At this point, most of the remaining functionality of the supervisor (other than GUI and other high-level I/O) should be made available. Multi-tasking and multi-processing would both be supported (started in the previous step), with process management and memory allocation.

One additional function may become available at this point -- extending the supervisor via ROM or flash ROM.

If there is an extension ROM, the initial boot ROM knows where it is. If it is supposed to exist, the checksum should be calculated and confirmed at this point.

The key to use depends on whether the extension has been provided by the manufacturer or the end-user/owner. Manufacturer's updates should be checked with the update key (not the boot key), and owner's extensions should be checked with the owner's key.

Failure would result in a state such as in step (7).

Testing the extension proceeds as follows:

There are at least two banks of flash ROM. In the two bank configuration, one is a shadow bank and the other is an operational bank.

If the checksum of the operational bank is the same as the unwritable extension ROM, the contents are compared. If they are different, the operational bank is not loaded, and the error is logged and potentially displayed on console.

If the checksum of the operational bank is different from the unwritable ROM, it is checked against the shadow bank. If the shadow bank and the operational bank have the same checksum, the contents of the two are compared. If the contents are different, the operational bank is not loaded and the error is logged and potentially displayed on console.

If the contents are identical, the cryptographic checksum is checked for validity. If it is not valid, the operational bank is not loaded, and the error is logged and potentially displayed on console.

* If the operational bank verifies, it is loaded and boot proceeds.

** If the operational bank fails to verify, a flag in the boot parameters determines whether to continue or to drop into a maintenance mode.

If the device drops into a maintenance mode, the test port becomes active, and a request for the admin password is sent out through it. A flag is set, and boot proceeds in a safe mode, to bring up I/O devices safely.

(When the operational bank is updated, the checksum checked and verified, and committed, the operational bank is copied directly onto the shadow bank. But that discussion is not part of this rant.)
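In outline, the verification just described might look like this sketch. checksum(), contents_equal(), sig_valid(), load_bank(), and log_fail() are hypothetical stand-ins for routines the initial boot ROM would provide:

    #include <stdbool.h>

    typedef struct bank bank_t;
    extern unsigned checksum(const bank_t *);
    extern bool contents_equal(const bank_t *, const bank_t *);
    extern bool sig_valid(const bank_t *, const void *key);
    extern void load_bank(const bank_t *);
    extern void log_fail(const char *msg);

    bool verify_extension(const bank_t *op, const bank_t *shadow,
                          const bank_t *factory, const void *key)
    {
        if (checksum(op) == checksum(factory)) {
            if (!contents_equal(op, factory)) {
                log_fail("operational bank mismatches factory ROM");
                return false;
            }
        } else if (checksum(op) != checksum(shadow)
                   || !contents_equal(op, shadow)) {
            log_fail("operational bank mismatches shadow bank");
            return false;
        }

        if (!sig_valid(op, key)) {      /* update key or owner's key */
            log_fail("cryptographic checksum invalid");
            return false;
        }

        load_bank(op);
        return true;
    }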


Other approaches can be taken to maintain a valid supervisor. For instance, two shadow copies can be kept to avoid having to restore the factory extensions and go through the update process again from scratch.

The extensions can override much of the initial boot ROM, but the monitor/debugger must never be overridden. It can be extended in some ways, but it must not be overridden.

There should be no way to write to this flash ROM except by setting another protected hardware switch or strap which physically controls the write-protect circuit for the flash. This switch or strap should not be the same as mentioned in step (7), but may be physically adjacent to it, depending on the engineers' assessment of threat.

*** The initial boot ROM should not proceed to the flash ROM extensions unless said switches or straps are unset.

(12) I/O devices.

(12.1) Locating and testing normal I/O device controllers.

As known good main RAM becomes available, the boot process can shift to locating the controllers for normal I/O devices such as network controllers, rotating disk controllers, flash RAM controllers, keyboards, printers, etc.

There may be some priority to be observed when testing normal I/O device controllers, as to which to initiate first.

It also may be possible to initiate controller self-tests or allocate another CPU to test the controllers, so that locating the controllers and testing them can be done somewhat in parallel.

Timers and other such hardware resources would be more fully enabled at this point.

* Errors for most controllers should be logged, and should not cause the processor to halt. 

(12.2) Identifying and testing devices.

As controllers become available and known good, the devices attached to them should be identified, initialized, and tested.

This might also occur in parallel with finding and testing other controllers.

* Errors for most devices should be logged, and should not cause the processor to halt. 

** Some intelligence about the form and number of logs taken at this point can and should be exercised. We don't want RAM filled with messages that, for example, the network is unavailable. One message showing when problems began, and a count of error events, with a record of the last error, should be sufficient for most such errors.
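A sketch of that logging discipline -- first occurrence, a running count, and the latest event, instead of one record per failure. The struct and names are hypothetical:

    #include <stdint.h>

    typedef struct {
        uint32_t first_time;   /* when the problem began    */
        uint32_t last_time;    /* most recent occurrence    */
        uint32_t count;        /* total events seen         */
        int      last_code;    /* detail of the last error  */
    } throttled_log_t;

    void log_event(throttled_log_t *log, uint32_t now, int code)
    {
        if (log->count == 0)
            log->first_time = now;  /* remember when trouble started */
        log->last_time = now;
        log->last_code = code;
        log->count++;
    }

One such record per error source keeps "network unavailable" from eating all of RAM.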

(12.3) Low-level boot logging.

As video output and persistent store become available, error events should be displayed on screen and recorded in an error message partition. Again, there should be a strategy to avoid filling the error message partition, and to allow as many error notifications as possible to remain on screen.

If the device is booting to maintenance mode, and an admin has not logged in via the test port at this point, the video device may present a console login prompt/window, as well.  Or it may present one for other reasons, such as a console request from the keyboard.

The video display could also have scrolling windows showing current system logs.

Also, parameter RAM flags may prevent console login to a local video device/keyboard pair, requiring admin login at the test port via some serial terminal device.

(12.4) High-level boot.

The supervisor would have hooks and APIs to present walled virtual run-time sessions to high-level OSes, including walled instances of itself and walled instances of Linux or BSD OSes, or Plan Nine, etc., to the extent the CPU can support such things, and to the extent the device is designed to support such things.

And parameter RAM would have flags to indicate whether a boot menu should be provided, or which of the available high-level OSes should boot.

If walled instances are not supported, only a single high level OS would be allowed to boot, and the supervisor would still map system calls from the high-level OS into device resources.



This is my idea of what should happen in the boot-up process. Unfortunately, most computers I am familiar with do a lot of other stuff and not enough of this.

Friday, April 14, 2017

how a proper init process should work

Should not be using valuable time to post this today.

This post
https://lists.debian.org/debian-user/2017/04/msg00441.html
to the debian users mail list, and my response to it:
https://lists.debian.org/debian-user/2017/04/msg00443.html
reminded me of this blog post
http://reiisi.blogspot.jp/2014/10/thinking-about-ideal-operating-system.html
that I wrote back in 2014.

I think I want to post the meat of that here, without the complaints about systemd, and with a little explanation.

(If someone were to implement these ideas, it would require a fork of the OS distribution to keep things sane, of course. It would likely also require a fork of the kernel.)

Now, if I were designing an ideal process structure for an operating system, here's what I would do:

Process id 1, the parent to the whole thing outside the kernel, would

  • Call the watchdog process startup routine.
    • It (process id 1) would not itself be the watchdog process, because pid 1 has to be simple.
    • The process watchdog routine would probably need some special mutant process capabilities.
    • The changes may ripple back to the kernel.
    • But such changes would be better than adding complexity to pid 1 to directly handle being the watchdog of everything.
  • Call the interrupt manager process startup routine.
    This may be obvious, but you do not want pid 1 managing interrupts directly. That would provide too many opportunities for pid 1 to go into some panic state.
  • Call the process resource recycling process startup routine.
    Recycling process resources must be kept separate from catching dying processes.
    • You don't want an orphaned process to have to wait for some other process to have its resources recycled, just to be seen.
    • And you don't want that complexity in pid 1.
  • Call the general process manager process startup routine.
    • This is the process which interacts with ordinary system and user processes. 
    • Most orphan processes would be passed to the process recycling process by this process.
  • Enter a loop in which it 
    • monitors these four processes and keeps them running, restarting them when necessary --
      (Ergo, who watches the watchdogs, erm, watchers? -- Uses status and maintenance routines for each.);
    • collects ultimately orphaned processes and passes their process records to the resource recycler process;
    • and checks whether the system is being taken down, exiting the loop if it is.
  • Call the general process manager process shutdown routine.
  • Call the process resource recycling process shutdown routine.
  • Call the interrupt manager process shutdown routine.
  • Call the watchdog process shutdown routine.
  • Call the last power supply shutdown driver if not a restart.
    (Note that this defines an implicit loop if it is not a full shutdown.)
All other processes, including daemons, would be managed by the process manager process.

This would break a lot of things, especially a lot of things that interact directly with the pid 1 process (e. g., hard-coded to talk to pid 1).

Traditional init systems manage ordinary processes directly. I'm pretty sure it's more robust to have them managed separately from pid 1. Thus the separate process manager process.
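As a sketch (nothing more), the skeleton of such a pid 1 might look like this in C. The four start routines and the helpers are hypothetical, and a real implementation would be tangled up with the kernel in ways this ignores:

    #include <stdbool.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    extern pid_t start_watchdog(void);       /* hypothetical */
    extern pid_t start_irq_manager(void);    /* hypothetical */
    extern pid_t start_recycler(void);       /* hypothetical */
    extern pid_t start_proc_manager(void);   /* hypothetical */
    extern bool  shutting_down(void);        /* hypothetical */
    extern void  send_to_recycler(pid_t, int status);

    int main(void)
    {
        pid_t kids[4];
        pid_t (*starters[4])(void) = {
            start_watchdog, start_irq_manager,
            start_recycler, start_proc_manager
        };

        for (int i = 0; i < 4; i++)
            kids[i] = starters[i]();

        while (!shutting_down()) {
            int status;
            pid_t dead = wait(&status);   /* ultimate orphans land here */
            bool core = false;
            for (int i = 0; i < 4; i++)
                if (dead == kids[i]) {    /* restart the four managers */
                    kids[i] = starters[i]();
                    core = true;
                }
            if (!core)
                send_to_recycler(dead, status);
        }
        /* the four shutdown routines, in reverse order, would go here */
        return 0;
    }

Note how little pid 1 does: start four children, keep them alive, pass corpses to the recycler, and get out of the way.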

There are a few more possible candidates for being managed directly by pid 1, but you really don't want anything managed directly by pid 1 that doesn't absolutely have to be. Possible candidates, that I need to look at more closely:
  • A process/daemon that I call the rollcall daemon, that tracks daemon states and dependencies at startup and during runtime. 
  • A process/daemon to manage scalar interprocess communication -- signals, semaphores, and counting monitors.
    (But non-scalar interprocess communication managers definitely should not be handled directly by pid 1.)
  • A special support process for certain kinds of device managers that have to work closely with hardware.
  • A special support process for maintaining system resource checksums.
The processes I consider ordinary, to be managed by the general process manager instead of pid 1, are basically everything else, including
  • Socket, pipe, and other non-scalar interprocess communication managers,
  • Error Logging system managers,
  • Authentication and login managers,
  • Most device manager processes (including the worker processes supported by the special support process I might have the pid 1 process manage),
  • The actual processes checking and maintaining system resource checksums,
  • Etc.
Definitely, code to deal with SELinux, Access Control Lists, cgroups, and that kind of fluff, should be managed by ordinary processes managed by the general process manager.

For me, this is daydreaming. I don't have the job, the cred, or the network that could put me in a position where I would have the time or resources to code it up.

I would love to, if I can find someone to front me about JPY 400,000 (USD 4,000.00) a month for about a year's work, plus the hardware to code and debug on (between about JPY 500,000 or USD 5,000.00 and double that).

[JMR201704211138: I had some thoughts on the low-level boot process, which might be interesting: http://defining-computers.blogspot.com/2017/04/model-boot-up-process-description-with.html.]

Wednesday, March 29, 2017

Google is Out of Control (More Big Is Bad)

I suppose I shouldn't complain about stuff I get for free.

I have several blogs on blogger/blogspot, which is part of Google. I've been using two of my blogs to try to publish, or broadcast, intermediate drafts of a novel that I am writing. Mostly it's like a typewriter, but it gets slow sometimes because Google's formatted editing functions in javascript just get really heavy.

Sometimes I have to do some detail work and go into HTML mode.

A couple of years ago, I ran into a bug. Their editor monitors your HTML, and if it finds an unbalanced tag, or other incorrectly written HTML, it lets you know.

This should be a good thing, right?

The problem is where it lets you know. Here are two pictures; see if you can see what is really, really bad UI design here:

[two screenshots: the editor window with the HTML error message showing]
Here is what it looks like after I fight the HTML checking and fix the tag:

[screenshot: the editor after the tag is fixed]
Let's see if I can put these side-by-side:

[the two screenshots, side by side and much reduced]
(Seems to only allow side-by-side if I make them really small.)

You can see, perhaps, how the entire editing area gets shoved down in the window while the error message is showing.

I don't know if it does this on all window managers, but on this one, here's what it means.

They apparently try to wait a while, to give you time to fix the tag you are working on.

So, you erase the opening tag of something you really wish the editor had not been so kind as to insert in there for you, and then you go searching for the closing tag. And you try to select it. And, just as you are trying to select it, the error message pops up, and the whole thing moves down, and you select you know not what.

I could have organized my work a bit better so the next shot is easier to see, but here's the kind of thing that happens:

[screenshot: lines above the end tag end up selected as the page shifts]
I had erased the start <div> tag and was going after the end tag. This one was fairly tame, but you can see that I end up with important lines before the tag selected. If I hit the delete key too fast, I have to hope the undo function works. (Eventually. It can be really slow.)

UIs should not start jumping around to show error messages, especially when the thing is written in javascript and is slow and subject to various fits and tempers anyway.

This kind of hand-holding by a temperamental nanny is just bad UI any way you look at it.

If they want to hold my hand while I work in HTML, can they at least give me a nice button to turn the hand-holding off until I'm finished with the delicate surgery?

I hit the feedback button on this problem about three years ago, IIRC, and their response was pretty good. Then, sometime later, the inappropriate behavior came back. This time, when I hit the feedback button, the lousy thing is too heavy to let me actually send them a screen shot and text. It basically leaves Firefox effectively frozen while it goes parsing through I know not what at the speed of javascript. I literally waited for an hour the first time I tried that, and had to force shutdown to get my machine back.

Google is too big to find a place to send this kind of feedback outside the defined channels. And the defined channels prevent the feedback. That's bad organizational design.

But that's what happens when a company (or a government, etc.) gets too big.

More big is more bad.

Tuesday, February 28, 2017

"Chip" Technology and Emperors with No Clothes

My niece tells me she has had her credit card number stolen again.

There are two reasons why chip technologies are just bad ideas, and I'll take this from the weaker reason to the stronger reason.

The weaker reason is in the encryption and in the CPUs that can be used in a device the size and shape of a credit card. For real security, you need at least an order of magnitude more total computing capability than you can fit in a credit card with any technology that can be specified today and physically implemented within the next ten years.

At least one order of magnitude, given known vulnerabilities in the encryption schemes.

(If I say more than that, some idiot will tell you I have to be speculating about vulnerabilities that will never be discovered, and I don't want to argue with idiots, so I'll keep it at a level that you can verify for yourself, if you are willing.)

There simply is not enough computing power available in a credit card.

You can't make the processors fast enough without the CPU burning a hole in the card.

And you couldn't power such a fast processor with a battery that would safely fit in a card.

Barring some serious advances through fundamental technology changes, the barriers in the actual physics of the devices basically prevent it from ever working. Current semiconductors just can't do it.

The stronger reason is the one about the no clothes. Some would say this is not a technological barrier, but it's still a barrier.

You cannot observe what happens in the transaction without special devices.

This is the fundamental reason internet voting is a bad idea, and that electronics do not belong in the voting booth.

It is the whole reason that the chips in your passport are a basic invasion of your privacy.

You can't see what is happening.

You don't believe me? Prove that you can see what is in your passport or credit card.

Prove that you can observe it when the credit card gives your signature to the card reader.

If you can, you have access to a certain class of device that ordinary people do not, and that can be regulated so that ordinary people could not legally have access to them.

We could imagine a world in which everyone has the equivalent of Google's glasses surgically implanted in their eyeballs, and the equivalent of a wardriver's rig embedded in their skulls, and that would allow people who cared to take the trouble to make the transactions visible to themselves.

But they would still be relying on electronics and software that someone else engineered. No single person could build the complete package all by himself, and therefore he would have to rely on someone else who could betray that trust.

("... build it all by herself"? Pardon me if I chuckle. Women regularly build such complex devices, but we call those devices "babies", and we do not claim that women understand all the technical ins and outs of those babies. And it's the understanding that is the problem and the answer here. Man or woman, it requires trusting someone whom you had better not be thinking is your God.

Babies, on the other hand, were designed by the only entity/process we dare rely on so implicitly. Dare? Well, we have no choice but to rely on that entity/process, whatever we call It/Him/Her. That is, to the extent we choose to not rely on It/Him/Her, we only destroy ourselves.

And that It/Him/Her seems really not to care much about our ability to claim we own a lot of something that nominally represents value in an artificial economic system.)

Actually, this is not a technological barrier. It's a barrier that goes a little beyond technology.

[JMR201702281021:

Mind you, I'm not saying we shouldn't use the chips. (Except in passports. That was a no-no.)

We just shouldn't advertise them for something they are not.

They are just a very unsecure addition to a financial transaction system that was fundamentally unsecure from the outset.

Here's how we should advertise them:

Credit Cards 
a very convenient way to allow someone to take your money!

and

Chip Cards 
an even more convenient way to allow someone 
to even more conveniently take your money!

I'm not calling for panic here.

Money is not all that valuable, anyway.

]

Friday, December 30, 2016

How do you remember passwords and PINs (and passphrases)?

This thread in the Ubuntu user list:


https://lists.ubuntu.com/archives/ubuntu-users/2016-December/288591.html

that inspired this blog post of mine:

http://defining-computers.blogspot.com/2016/12/passwords-passphrases-public-key-and.html

inspired another blog post.

I've talked about this before in this blog:

http://defining-computers.blogspot.jp/2012/01/good-password-bad-password.html mentions it in trying to lay out some basic approaches to choosing good passwords and remembering them.

http://defining-computers.blogspot.jp/2016/08/multiple-login-methods-for-one-user.html goes on at some length about memorability, while trying to explain why you really don't want to log in the same way every time you use a computer or other information device. (And maybe ending up missing the point for the general case.)

http://defining-computers.blogspot.jp/2015/10/why-personal-information-in-e-mail-is.html mentions the problems, and tries (not very successfully) to talk about a few ways to send a PIN to someone else without damaging your finances.

But I decided I'd lay out some processes for remembering things here.

First, you may read somewhere that experts strongly advise against writing PINs and passwords down. Your bank is probably required by the insurance company to say that.

This is a prime example of simple rules just simply not being enough. If you don't understand why you shouldn't write it down, read the post about sending PINs that I linked above.

If you don't understand why you need to write it down somewhere safe, read the other two posts, then think about why it's easy to remember a friend's phone number:
  • Regular use, 
  • Positive re-inforcement, 
  • The fact that three mistakes in a row doesn't send you back to the bank to get the number changed, 
  • Etc. 
  • And -- drumroll, please -- you write that phone number down anyway. The first several times you use it, you are reading it out of your phone book.
There's another little thing here that the guys who designed the PIN concept conveniently ignored:

Short numbers are easy to remember. They are also easy to misremember.

It's actually easier to remember a seven-digit or twelve-digit number that you regularly use than a number of four digits or fewer.

Why?

Because you use so many of them, for one thing. (There are several other important reasons, but that should be enough for the purposes of this post.)

Conclusion?

Banks really should start putting on-screen keyboards in the ATMs and allowing letters with the numbers, and longer passcodes instead of PINs. (And still keep the ATM passcode separate from the on-line banking login.)

But that's too easy, so banks are trying to get you to use hardware tokens, which introduce the manufacturer of the token as a new attack vector. Which I have blogged about somewhere, also. I mentioned it in passing in a general post about computer security. I think I went into more detail somewhere, but I don't remember where right now. I'm getting sidetracked again. :)

Back to remembering these tokens.

Lots of people used to write passwords down on sticky notes and attach them to the screen of the workstation or terminal where they used them. (You never did, right? :-/) It's hard to imagine they didn't understand the blatant irony in doing so, but they didn't want to believe they couldn't trust the people around them.

Trust is an important concept here. I've tried to approach it in my freedom is not free blog and other places. It's pretty much an undercurrent in most of the posts in this blog. I think I talk about it a bit in the security rants I posted in February 2013. I talked about it in my rant about trust certificates and certificate authorities and my rant about entrepreneurship, trust, and crowdfunding in my main blog. I need to post something dealing more directly with it here or in the freedom-is-not-free blog, sometime.

People have an innate need to trust.

My sister had a friend who wanted, very much, to tell my sister her PIN and let her borrow her ATM card. I shouldn't mention this except that it illustrates that trust is something we tend to develop blind spots about. That friend knew better, she just also needed someone she could trust. Which means that my sister's telling her to keep it to herself was the only right thing to do -- for all reasons.

We are all irrational about trust, because not trusting is too hard. But passwords and other cryptographic tokens are important. We can't leave our bank PIN stuck to our ATM card in our purse or backpack.

No passwords on stickies, or even any scrap of paper that you will not be controlling until you destroy it. No passwords in your daily planner or phone book.

No book of passwords unless it is under lock and key whenever it is not in your hand or in your pocket.

Lately, we are beginning to see password managers for your computers and for your portable information device (cell phone, etc.). They are convenient. They are sold as the software equivalent of a little book of passwords in a locked drawer.

There are two problems with password managers.

One, they are made by someone you don't know that you can trust. (Or, to put it in a positive phrase, if you want to use a password manager app, get to know the author and make sure the author is someone you can trust. Make sure the author has more to lose by a hidden breach of trust than he or she has to gain, etc.)

You might build your own, and I might try to show how on my programming fun blog, sometime, but that is definitely not a small project.

The other is that they can become a single point of failure. Someone gets in to your password manager, and they have access to every account or whatever that the password manager manages.

But whether you trust the author or build your own, you definitely need a really strong password that you can remember for the manager.

So. Let's talk about remembering these cryptographic tokens without a password manager app.

We'll start with clues for PINs.


Remembering a PIN is more a question of choosing a PIN you can remember. I've tried a variety of things, things like the following list:
  • The address of a girl I had a terrible crush on in junior high school, reversed, of course,
  • Two consecutive two-digit prime numbers that I had a certain mathematical interest in,
  • 10,000 minus the model number of a bicycle that I no longer have,
  • The next four-digit prime number after that model number,
  • The ascii codes for my favorite cousin's initials,
  • My dog's name, as I would punch it in on my cell phone,
  • The sum of George Washington and Abraham Lincoln's birthdays.
I had trouble remembering those, can you believe? And they still aren't all that safe, especially now that I've listed them up there like that.

If I had a note in my pocket planner:
First Tokyo Bank -- yellow bike 10's comp.
you might easily guess that PIN and know which bank it might work on.

So, I might write that I took 10's complement, but reverse the order instead. Or note it as the combination to my sports club locker instead of writing the bank's name.

So I list it with some fake stuff, like a fake mail box at work, the name of a bank I don't have an account with, etc. And mix it with some real numbers that I don't care if people know:
  • George's cell phone -- 313-843-2112
  • Second Bank of the Northeast -- that beauty's initials
  • Sports club combo -- yellow bike 10's comp.
  • Work phone extension -- 6732
  • Book club membership -- zerostone and the moon

So an important key is to be devious. Write something misleading, along with something that helps you remember the number.

But then you have to remember how it was different from what you wrote. If you use the PIN regularly, it will be easier to remember, of course. But after a month or two of not using the ATM, not so easy.

So, here's another suggestion -- use the ATM once or twice a month. Stop by and check your balance, even if you already know what it is.

Okay. So far, so good. But, you're worried that the yellow racer 2626 is too obvious, and, after trying 8484 and 7374, the attacker might try 6262, which is what you set it to.

Ciphers are good for making passwords, too. Let's try a few more simple rotation and arithmetic ciphers:
  • rot 1: 3737
    2 + 1 = 3,
    6 + 1 = 7
  • rot 9: 1515
    2 + 9 = 11 -- drop the carry: 1,
    6 + 9 = 15 -- drop the carry: 5
  • rot 1234: 3850
    2 + 1 = 3,
    6 + 2 = 8,
    2 + 3 = 5,
    6 + 4 = 10 -- drop the carry
  • add 9876: 2502
    2626 + 9876 = 1,2502 -- keep the carries except the last
  • sub 5432: 2806
    2626 - 5432 = -2806 -- ignore the negative sign
Of course, if February 25th is your son's birthday, or June 28th is your daughter's, 2502 or 2806 would be out. Choose a different cipher this time.
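If you want to check your arithmetic, the digit-by-digit rotation fits in a few lines of C. (A toy, obviously -- don't keep real PINs in source files.)

    #include <stdio.h>

    /* Add key to pin digit by digit, dropping carries (mod 10). */
    static int rot_pin(int pin, int key)
    {
        int out = 0, place = 1;

        for (int i = 0; i < 4; i++) {
            int d = (pin / place) % 10;
            int k = (key / place) % 10;
            out += ((d + k) % 10) * place;   /* drop the carry */
            place *= 10;
        }
        return out;
    }

    int main(void)
    {
        printf("%04d\n", rot_pin(2626, 1111));  /* rot 1    -> 3737 */
        printf("%04d\n", rot_pin(2626, 9999));  /* rot 9    -> 1515 */
        printf("%04d\n", rot_pin(2626, 1234));  /* rot 1234 -> 3850 */
        return 0;
    }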

Saying this once again: what we are doing here is using the enciphered result as the PIN, rather than writing down an encrypted form of a number we remember, because we can write the memorable number in words, while writing the cipher down in words is not so easy, and may draw attention to what we are doing.

So, you write the note about the PIN on one page, and on another page, you write a note about what you did to it, but it has to be someplace that doesn't stick out. Maybe it would be in a pocket address book, instead. Or on a slip of paper in your wallet (which you could rip to shreds and throw away in multiple trash cans, once you are confident you'll remember it).
Owe 12.34 to George
might mean that you subtracted 1234 from the model number to get the PIN you are currently using.

Going to all this work, you still need to keep that pocket planner where people won't start thumbing through it. Don't leave it lying around in the office or on the living room sofa. If you keep it in your bag, make sure your bag is not somewhere co-workers, customers, students, or random strangers might pick it up, rifle through it, or walk away with it because it looks interesting.

Walking away: Keep a copy of the actual PINs in a locked drawer somewhere or maybe an extra copy of the obfuscation. Then if your pocket planner walks away, you have the PINs when you go to the bank to change them.

And you can mark out the ones you remember as you remember them, or rip the page out and rip it up, throwing the pieces away in different trash cans.

Again, the problem with being devious is that someone else might decide to be devious the same way.

Now let's get some ideas for passwords.


Simple ciphers like the ones I've described above are no good against someone with a computer. On the other hand, a simple cipher on your password is better than keeping the bare naked password in your little book of passwords, as long as you protect your little book of passwords carefully.

The discussion of PINs lays the groundwork and motivation, but passwords and passphrases contain more than numbers. There are two ways to deal with that. We can cycle through the numbers, letters, and punctuation separately, or all together. (And there are some variations, which I will leave to you to figure out.)

We've looked at cycling through the numbers, but it may be easier to understand if we put a list of the numbers, letters, and punctuation in front of us. First, let's just list the whole "printable" ascii chart:
 32:  !"#$%&'()*+,-./
 48: 0123456789:;<=>?
 64: @ABCDEFGHIJKLMNO
 80: PQRSTUVWXYZ[\]^_
 96: `abcdefghijklmno
112: pqrstuvwxyz{|}~
Notice the space character at code point 32.

But the code point numbers might get in the way. Let's look at the chart without them:
 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~
Using this chart, a rot 1 cipher just uses the next character in the list. And if the character you are encrypting is the last one, "~", wrap around to the first character, the space.

Which raises a question: do we encrypt blanks or not? With passphrases, it's an important question. Here we will encrypt blanks, because I don't think it's a good idea to leave too many clues.

Let's pick a nice phrase to start with:
Bob's your uncle.
That's too common, so let's make it a little less common:
Bobbette's your aunt.
Well, okay, that isn't really much less common. If I were doing this for real, I'd probably want to muck things up a bit further, like this:
Bobbin's my aunt Friday.
Okay, now we l33t-$pe@k it:
80bBIn'5 m4 ant phRYd@y.
And, because our server doesn't allow space, we'll remove the spaces:
80bBIn'5m4antphRYd@y.
That's a twenty-one character password, but you can remember it because of the phrase it's based on. Still, you won't always be able to recall it on demand, so let's encrypt it with rot 2 before we write it down. Bring that clot of ascii back:
 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~
and refer to it while we work. Slide everything over two characters:
80bBIn'5m4antphRYd@y.  (original)
:2dDKp)7o6cpvrjT[fB{0  (encrypted, if I made no mistakes)
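
If you don't quite trust your head-math, here is a minimal Python sketch of the same rot-n cipher over the 95-character printable ascii list, space through "~". The function name ascii_rot is mine, for illustration only.

def ascii_rot(text, n):
    # Rotate each printable character n places, wrapping at the ends.
    return "".join(chr((ord(c) - 32 + n) % 95 + 32) for c in text)

print(ascii_rot("80bBIn'5m4antphRYd@y.", 2))   # :2dDKp)7o6cpvrjT[fB{0
print(ascii_rot(":2dDKp)7o6cpvrjT[fB{0", -2))  # and back again
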
The only problem with this is that if
Google mail: :2dDKp)7o6cpvrjT[fB{0
is in your pocket planner, the attacker is going to think,
That's not plaintext. No way is someone who makes up a password like this going to just leave it there for me to grab.
And she will proceed to do the reverse of what we just did, trying rot -1 and rot 1, then rot -2 and rot 2:
91cCJo(6n5bouqiSZeAz/ (rot -1)
;3eELq*8p7dqwskU\gC|1 (rot 1)
80bBIn'5m4antphRYd@y. (rot -2)
<4fFMr+9q8erxtlV]hD}2 (rot 2)
and, because she's interested in l33t-$pe@k, too, she notices that rot -2 gives something that looks suspiciously meaningful. And she thinks,
"Bobbin's my aunt, uhm Friday." Could be a mnemonic password.
And, not only does she have your password, but if you used rot 2 anywhere else, she has those passwords, too.

Okay, so the encryption probably discourages the average junior high school student, but it only attracts the attention of the determined attacker with a bit of free time.

We could use an alternating rotation and still be within range of what we can do in our heads. Start with rot 1, then rot -1, then rot 2 and rot -2, and cycle back to rot 1:
 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~
80bBIn'5m4antphRYd@y.  (original)
9/d@Jm)3n3cluojPZcBw/  (encrypted, if I made no mistakes)
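
Here is a minimal Python sketch of the alternating version, again over the 95-character printable ascii list; the name ascii_rot_multi is mine.

from itertools import cycle

def ascii_rot_multi(text, shifts):
    # Rotate each character by the next shift in the repeating pattern.
    return "".join(chr((ord(c) - 32 + n) % 95 + 32)
                   for c, n in zip(text, cycle(shifts)))

print(ascii_rot_multi("80bBIn'5m4antphRYd@y.", [1, -1, 2, -2]))
# 9/d@Jm)3n3cluojPZcBw/ -- matching the hand-worked result, I believe
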
Another option, which might make it easier to work without an ascii chart, is to rotate the numbers, the letters, and the punctuation ranges separately:
0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
[SP]!"#$%&'()*+,-./
:;<=>?@
[\]^_`
{|}~
But that means the punctuation in the phrase remains punctuation, which gives the attacker information that you may not want to give.
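
Still, here is a minimal Python sketch of the separate-range rotation, using the lists above; the helper name range_rot is mine.

RANGES = [
    "0123456789",
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "abcdefghijklmnopqrstuvwxyz",
    " !\"#$%&'()*+,-./",
    ":;<=>?@",
    "[\\]^_`",
    "{|}~",
]

def range_rot(text, n):
    # Rotate each character within its own range, so digits stay digits,
    # letters stay letters, and punctuation stays punctuation.
    out = []
    for c in text:
        for r in RANGES:
            if c in r:
                out.append(r[(r.index(c) + n) % len(r)])
                break
        else:
            out.append(c)  # leave anything unlisted alone
    return "".join(out)

print(range_rot("80bBIn'5m4antphRYd@y.", 2))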

Anyway, the problem is that the password looks like a password. In fact, it looks like a hard password.

Wait. How about if we take a note we made to ourselves in our pocket planner --
Milk and eggs, no bread.
and do a rot 2 on that?
Oknm"cpf"giiu."pq"dtgcf0
That might make a sort of nice password, except for the spaces being turned into quotes.


Remove most of the spaces and the period, l33t-$pe@k it a little, and add "today":
mIlk&egg5, 0Bread2d@y
but we don't actually write that anywhere. Maybe we can remember that much. Rot 2:
oKnm(gii7."2Dtgcf4fB{
and there's a nice password. (And a longer string for a passphrase.)

See how that works? The encrypted form looks like an innocuous phrase. It doesn't catch anyone's attention. Maybe you never actually write it in your pocket planner.

It takes a little time to do the decryption in your head, but you'll remember that password pretty soon.

(I'm working on a little tool to do that on my computer or tablet. I'll blog about that in my programming fun blog sometime.)

Don't try to use exactly what I have demonstrated here. Pick and choose, use what you can understand and work with.

Thursday, December 29, 2016

passwords, passphrases, public key, and tigers and bears, oh, NO!

Here's a thread from the Ubuntu user list that might be informative on the subject of passwords and passphrases:
https://lists.ubuntu.com/archives/ubuntu-users/2016-December/288591.html
Here's my take on the differences between passwords and passphrases:
https://lists.ubuntu.com/archives/ubuntu-users/2016-December/288644.html
I decided to unpack that post below:

[JMR201612302137: It has been pointed out that I have used a bit of non-standard explanation below. I'll try to make my explanation more compatible with current standard practice. While I'm at it, I'll fix my grammar and spelling, silently. (heh heh) ]



Passwords vs. Passphrases


I think the distinction has become fairly general in practice --

Passphrases are assumed to be used in indirect authentication, like public-key.

Passwords are assumed to be what you use when authenticating directly.

And also, there is a difference in what they look like.

Reasonably good passwords are assumed to be strings like
m0n<e4UR@Tom
and reasonably good passphrases are assumed to be more like
I have a monk{y for your atom, Kite.
Both are intended to be easy for the owner of the password or passphrase to remember, but hard for other people to guess.

Which means that, now that I have posted those two examples to the list and here, neither of them is any good for either me or the people on the Ubuntu user list. Or for you, now that someone might know you have read this post.

And here's a good question for you -- which one would you find easier to remember? Which do you think would be easier to guess?

(And I won't tell you the answer, because I would be telling you only half the truth, either way.)

Good Passphrases, Bad Passphrases


You may be wondering whether "m0n<e4UR@Tom" might actually be considered a passphrase. It's close, and because it is, it has certain relative weaknesses as a password. As a passphrase, you might think you want to use it like this:
m0n<e 4 UR @Tom
And that would be a harder passphrase than
I love my dog.
But it would also be harder to remember correctly. And, in fact, dividing it into words makes it easier to reverse the l33t-$pe@k. (Passphrases tend to be attacked as strings of words, rather than as strings of characters.)

You do want a few more words and a little less easily parsed meaning in your passphrases.

One reason you should prefer to use a passphrase is that it should be easier than a "good password" for you to remember, and harder than a "good password" for the attacker to guess or brute-force. (Did I already say that?)

Now, you might also be wondering, is "m0n<e4UR@Tom" a good password?

Is it good, aside from what I pointed out above -- that it has now been seen on the internet, and will therefore quite probably find its way into the dictionaries that brute-force attacks use?

If you think you want to use it, don't use the same words. You could use the same ideas, or slightly different ones. Maybe use a different animal, a different sub-atomic particle, and a different preposition, etc.

But it is also too short for anything really valuable:
  • It might be stronger than necessary for your password for the Ubuntu user mailing list. 
  • It might be about the right strength for your blog. 
  • But for your bank, it's a little too weak. You would want a stronger password or passphrase. 
Remembering is important, but the real value for passphrases lies in their assumed use in indirect authentication methods, and in the change in habits that the indirection allows.

Indirection?

Tokens, Keys


Before I explain that, let me explain the word "token" and the word "key", along with a few other concepts:

A token is a hard-to-guess string used as a password or passphrase, or something similar.

Tokens for humans to use tend to have some apparent meaning whereby they can be remembered.

Tokens for machines to use do not need to have any apparent meaning.
3iH$lqp9Pz"Qgf!#|]sQQ2_bAd*
might be a good token for a machine to use.

(Twenty-seven characters is an odd length, which will be wrong for many purposes.

Coming up with a lot of strings like that is actually pretty hard work for a human, but a carefully written program can do it relatively easily. Well, easy for the user, that is. The programmer/engineer has to work pretty hard on such programs to get them right.)
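
For what it's worth, here is a minimal Python sketch using the standard library's secrets module, which is designed for jobs like this. The alphabet and the default length are arbitrary choices of mine.

import secrets
import string

# Letters, digits, and punctuation, chosen cryptographically at random.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def make_token(length=28):
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(make_token())  # something unpronounceable and meaningless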

A key is a token used to protect, for example, a database of tokens.

A database of tokens is a folder or file with OS-level protections set so that only the owner of the tokens is able to access them -- at least, not without removing the storage device from the information device or booting the device to a different OS.

It should also be encrypted, since removal and rebooting are not that hard. The key to the database of tokens is the key used to encrypt it.
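
Here is a minimal Python sketch of the idea, assuming the third-party cryptography package is installed. A real password manager derives the key from your passphrase and does a great deal more.

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # this is the key you must protect
vault = Fernet(key)

# Encrypt one token; a real database would hold many.
encrypted = vault.encrypt(b'bank: 3iH$lqp9Pz"Qgf!#|]sQQ2_bAd*')

# Write "encrypted" out to the protected file; later, with the key:
print(vault.decrypt(encrypted))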

Indirection:


The key in the present case is the passphrase that you use to decrypt the tokens in the database.

You probably have an agent application that actually handles the tokens, so you aren't decrypting them and passing them around by hand. You tell the agent your passphrase, and it handles the rest. That way, you only need to remember your key to the database. That key becomes, indirectly, your key for all the applications, web sites, etc., that the agent application can interface with.

Indirection is a way of managing all that stuff.

Public-key Encryption


One more thing to understand is public-key encryption. The idea is simple:

It's as if you have a physical safe with two physical keys.

One key allows you to put things in it. You protect this one carefully, maybe you keep it on your person at all times. You never lend it to someone for any reason, and you never even let someone with a camera or photographic memory get a good look at it.

The other key only allows you to get things out. You can't use it to put things in the safe. You give copies of it to anyone who needs to get things out of the safe.

If you keep the first key private, everyone knows that whatever is in the safe is something you put in there.

That's the allegory. It isn't a perfect fit, but allegories rarely are.

With public key encryption, you have two keys. You absolutely must keep your private key safe from being seen or copied.

With the private key, you encrypt things. [JMR201612302155: Specifically, you encrypt things that people need to be able to satisfy themselves were encrypted by you. And you can say that your having encrypted the message is equivalent to your having signed the message.

The process is usually a little different from simple encryption, and it is usually called signing, rather than encrypting. ]

With the public key, you can decrypt what you encrypted [JMR201612302155: , or validate what you signed, ] with the private key.

[JMR201612302155: Now, if you are thinking about this, it might not make sense. Usually, as you might be thinking, you encrypt things you don't want people to read. But everyone can get the public key. That's why it's called public. So why encrypt it?

Repeating myself, because it may be a little difficult to accept as reasonable: if we think simply, we could encrypt the message with the private key and distribute both the message and the encrypted message. ] People who want to prove you encrypted, or signed, the message decrypt the encrypted copy. If the two match, you must have encrypted it.

[JMR201612302247:

(Or your private key has been stolen. You need to be able to tell the world that the key has been stolen. This is called revoking a key, and I won't explain it much here, except to note that publicizing a revocation is considerably more difficult to get right than publicizing a public key.)

The public key doesn't look anything like the private key, of course. The public key and the private key will not appear to be related. ] We don't want people to be able to calculate or guess the private key from the public key.

Now, encrypting an entire known message with a private key will provide an attacker with a lot of clues about the private key, so we don't really do that.

But what we really do is somewhat similar to what I've described above. Among other things, we produce something called a checksum that serves as a short, nearly unique summary, or fingerprint, of the original. Any change to the message should change the checksum. And then we encrypt the checksum.
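
Here is what the checksum looks like with SHA-256, a common choice, from Python's standard library. One changed character gives an entirely different fingerprint.

import hashlib

print(hashlib.sha256(b"Milk and eggs, no bread.").hexdigest())
print(hashlib.sha256(b"Milk and eggs, no bread!").hexdigest())
# The two digests are completely different.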

And there's a little more to it than that, and you should take the time to read and understand the full story at some point.
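
And here is a minimal sketch of signing and verifying, assuming the third-party Python cryptography package. With this particular algorithm, Ed25519, the library takes care of the checksum step internally.

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()  # keep this one secret
public_key = private_key.public_key()       # hand this one out freely

message = b"I wrote this."
signature = private_key.sign(message)

public_key.verify(signature, message)  # raises InvalidSignature on tampering
print("the signature checks out")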

[JMR201612291328:

I should point out that the encryptions themselves are known to be vulnerable. But the vulnerabilities are assumed to be hard enough for ordinary use, at present.

]

[JMR201612311415:

Uses of the Public Key


This may feel counterintuitive as well, but having people encrypt things using the public key is an option.

Even though the keys are considered asymmetric, in many implementations the choice of which key to call the private one is arbitrary. Either key can be used to encrypt a message, and the other key used to decrypt it.

This can be useful when connecting with an external server. Think about the spoofed domain names and IP addresses discussed earlier. Without something more than the domain name, or even the IP address, you don't really know that the server you are trying to connect with is the server you think it is.

That means you may end up trying to log in to a server that is run by a bad-guy, and it's going to ask for your password or passphrase.

How do you avoid telling the bad-guy pretending to be your external server your important, secret password or passphrase?

(Even the lowest-level MAC address on your network interface can be spoofed, so the MAC is no help, either.) A bad-guy server can tell you anything that it knows, and you have no way of telling who told it to you.

If your workstation and the server have exchanged public keys in the past, you can send the external server a message encrypted with its public key, and it can decrypt the message. Then it can send you an appropriate response encrypted with your public key and you can decrypt the response.

Again, there's more to the exact process, but, if the exchange is successful, you can have a fairly high level of confidence that the server you are trying to connect to is the one you think it is.

Then you can send it your login name and password or pseudo-random token or one-time-password or whatever with a fairly high degree of confidence. And, since asymmetric encryption consumes processor resources, you can exchange the encrypted symmetric session keys that will be used to encrypt the session data traffic.
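
Here is a minimal Python sketch of arriving at a shared session key, again assuming the third-party cryptography package. Note that this uses a Diffie-Hellman style exchange (X25519) rather than the encrypt-a-challenge exchange described above, but the goal is the same: both sides end up holding the same symmetric key without it ever crossing the wire.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

client_private = X25519PrivateKey.generate()
server_private = X25519PrivateKey.generate()

# Each side combines its own private key with the other's public key...
client_shared = client_private.exchange(server_private.public_key())
server_shared = server_private.exchange(client_private.public_key())
assert client_shared == server_shared  # ...and gets the same shared secret.

# Stretch the shared secret into a symmetric session key.
session_key = HKDF(algorithm=hashes.SHA256(), length=32,
                   salt=None, info=b"session").derive(client_shared)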

]

The Value of Security in Authentication


This all sounds very cool, very technical, very much like something you might think you can safely depend on.

Don't be fooled. The technology is not perfect. It's good, for now, if you know what you are doing.

If you don't know what you are doing, you can make simple mistakes that render the whole thing meaningless for a set of keys you have generated.

On the other hand, if you don't know what you are doing, or, more accurately, why you are doing it, you could waste a lot of time on unnecessary security measures.

I'm not going to lay the arguments out here, but the value of both passwords and passphrases is significantly reduced for the user who refuses to try to understand the nature of the attacks that are in use, and in use by whom.

You have to know yourself, know your enemies, know your friends, and know your strangers. You need to know the streets (wires) on which you travel (communicate).

Technology is easy to collect, and it quickly turns into cruft. Developing habits that allow you to try to understand the threats you face and where they come from should have higher priority than implementing security measures that you don't really understand.

[JMR201612291328:

At some point, we, as a race, are going to have to learn that nothing we have is really all that valuable.

The truly valuable things we have are things that others can't take from us unless we agree -- not even by force.

And the most valuable things we have are strangely ordinary things that seem not to obey the laws of physics. The more we give them away, the more of them we have to give away again. Giving them away is what makes them valuable.

]

(I talk a little about security in this post, and other places in this blog, and in some of my other blogs, especially the one on freedom not being free. You might find what I say useful, or you might not. I'm not perfect, either.)

[JMR201612301549:

And I've posted a little rant that might help remember passwords and passphrases, and might help understand simple encryption, here:
http://defining-computers.blogspot.com/2016/12/how-do-you-remember-passwords-and-pins.html.

]