Null values and nullable types have posed long-standing problems for computer scientists and language implementors alike. Technically speaking, a null represents missing or unknown information, which is a perfectly valid condition that occurs in real programming situations. On the other hand, nulls also raise questions such as "are two nulls ever equal?" and "what is the semantic value of a null?", concerns that can easily make or break a system. In some languages, C being a notable example, nulls have no formal handling and can easily lead to errors.
The general consensus in computer science is that nulls are never equal to other nulls. The idea is that since a null represents an unknown value, and two unknown values cannot be proven equal, two nulls cannot be proven equal either. However, I should note that as unknowns they cannot be proven unequal either, so this rule is not fully rigorous but rather an attempt at a consistent practical convention. It is also the approach taken by SQL, and it's an important question, since unknowns occur very frequently in database code and represent important edge cases.
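SQL's convention can be sketched as three-valued equality. Here's a minimal sketch in Rust, using Option to stand in for a nullable value (the function name sql_eq is mine, for illustration): comparing against an unknown yields unknown, never true or false.

```rust
// Sketch of SQL-style three-valued equality: None models NULL.
// Comparing anything with an unknown yields unknown (None), not true/false.
fn sql_eq(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None, // either side unknown: the result is unknown
    }
}

fn main() {
    assert_eq!(sql_eq(Some(1), Some(1)), Some(true));
    assert_eq!(sql_eq(Some(1), Some(2)), Some(false));
    assert_eq!(sql_eq(None, Some(1)), None);  // NULL = 1 is unknown
    assert_eq!(sql_eq(None, None), None);     // NULL = NULL is also unknown
}
```

Note that the result itself is nullable: SQL doesn't claim two nulls are unequal, it refuses to answer, which is exactly the "cannot be proven either way" situation described above.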
There is another problem with nulls: in some cases they represent the absence of a property, which gives them a meaning similar to boolean 'false' in conditional statements. In some languages, notably Lua, nulls are semantically equivalent to false for this reason, but this introduces another problem. Consider a situation where you have a boolean that might also be null; in that case it may be necessary to distinguish between null and false. Nullable types are a solution to this problem, but they don't formally address the previous problem, nor do they give a formal semantic meaning to nulls in general.
I propose a solution which completes nullable types as a formal answer to the null problem, based around the key observation that nulls are not merely unknown: from the perspective of the program they don't exist at all. As a consequence, nulls cannot be compared, because their values do not exist. Nulls also cannot be (directly) equivalent to false, again because their values do not exist. Instead, nullable types carry an additional property: they either do or do not exist.
By performing an explicit 'exists check' on a nullable type, the 'null or not null' property can be properly converted into a boolean, which can then be used to determine how to handle the value, i.e. whether it can be compared and used or not, without resorting to voodoo semantics by casting nulls to booleans or unconditionally returning 'false' when comparing them. By requiring null guards for nullable types, and by providing such checks in the compiler, it should be possible to statically guarantee correct behavior that satisfies all the practical and theoretical properties of nulls.
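For illustration, here's roughly how this works out in a language like Rust, where the Option type plays the role of a nullable type and the compiler enforces the null guard (the describe function is a made-up example):

```rust
// A nullable integer: the value either exists (Some) or does not (None).
fn describe(n: Option<i32>) -> String {
    // The explicit 'exists check': the null-or-not property becomes a branch.
    // The compiler will not let us touch the value without this guard.
    match n {
        Some(v) => format!("value is {}", v), // safe: v is known to exist here
        None => String::from("no value"),     // handled without comparing a non-value
    }
}

fn main() {
    assert_eq!(describe(Some(7)), "value is 7");
    assert_eq!(describe(None), "no value");
}
```

The point is that the null never gets compared or coerced to false; the only thing ever tested is the existence property itself.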
As a side note, the most common usage of nullable types (which appear most often in pure functional languages) is error handling, i.e. if a function can fail it might do so by making its outputs null. I personally disagree with this usage, from both the perspective of nulls and the perspective of error handling, which I'll talk about more in my next article.
Between the Comments
Wednesday, August 12, 2015
Friday, November 28, 2014
On Process Schedulers and "Throughput"
What a Process Scheduler Does
To be fair to those who don't know every detail of how an OS works, I'll start by explaining what a process scheduler does and why it's important in the first place. Every operating system has a process scheduler; it's a required function of the kernel, the lowest-level part of the OS. The process scheduler determines which program gets to run next, and how long it gets to run before it's kicked off the processor so that some other process can run for a while (called preemption). The process scheduler is very important for system performance: it determines to a large extent how responsive the user interface will be, it determines how well the system can multitask, and it can even affect how long things take to finish from the time they're started.
I think I should also mention what a process scheduler does not do, which is anything else useful. Every processor cycle spent doing scheduling tasks is time that hasn't been spent rendering web pages, processing game mechanics, compiling, decoding music or video files or anything else that the user wants the computer to do. In other words, the less time the scheduler spends scheduling in general, the better.
Performance Metrics for Process Schedulers
There are two meaningful performance metrics for schedulers, and while they are difficult to measure with any exactness, it is possible to get decent estimates and comparisons. The first is latency: how long a process spends waiting, on average, between the time it asks to run and the time it actually gets to run. Latency has a large impact on things like games and movie playback, where short response times are important, and lower is better.
The second is turnaround time. Turnaround time measures how long it takes to complete a compute-intensive job (like compiling) from start to finish. Again, smaller times are obviously better, since that means jobs are finishing faster, and we like things to be faster.
"Throughput"
Now, there exists a third metric which has become popular to cite, and has even been the central topic of certain high-profile CS papers, termed "throughput". "Throughput" measures how many processes per second the scheduler actually runs, but this does not count completed jobs, just processes scheduled. It is thought, wrongly, that 'throughput' is perhaps some average of turnaround time and latency, but in practice there is no correlation between throughput and any other measure.
On paper "throughput" sounds reasonable enough. Running more processes should result in better multitasking performance, or something like that. In reality though, the only thing they're measuring is how many times the scheduler preempts processes per second, and remember that each scheduling event is time not spent doing anything really useful, so more is not better. For turnaround times, the best result usually comes from long scheduling 'time slices', which is the opposite of 'high throughput'. For latency things are more complicated. Shorter time slices in general mean less waiting time, but even for latency-sensitive processes turnaround times are still important, so a balance must be struck to achieve good all around performance. However, that balance does not in any way correlate to "higher throughput".
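To make the overhead argument concrete, here's a toy model (all numbers are invented for illustration, not taken from any real scheduler): round-robin scheduling where every context switch burns a fixed amount of otherwise-useful CPU time.

```rust
// Toy round-robin model: each context switch costs a fixed overhead.
// With `jobs` processes each needing `work_per_job` units of CPU time,
// total wall time = useful work + time lost to scheduling events.
// Shorter slices mean more switches, so more time lost to the scheduler.
fn total_time(jobs: u64, work_per_job: u64, slice: u64, switch_cost: u64) -> u64 {
    let switches_per_job = (work_per_job + slice - 1) / slice; // ceiling division
    jobs * work_per_job + jobs * switches_per_job * switch_cost
}

fn main() {
    let long_slices = total_time(4, 1000, 100, 5); // few scheduling events
    let short_slices = total_time(4, 1000, 1, 5);  // one switch per work unit
    // 'Maximum throughput' (slice = 1) makes every job finish later.
    assert!(short_slices > long_slices);
}
```

In this model the slice-of-one "high throughput" configuration spends several times the useful work in pure scheduling overhead, which is the point: more scheduling events per second is a cost, not a benefit.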
Finally, a user cannot feel "throughput". A user can feel latency, in terms of smoothness of the UI, video playback, etc. A user can feel turnaround times, in direct terms of how long CPU-heavy applications take to finish jobs (or how many simultaneous clients their web server will support). Even a modest improvement can be quite noticeable in either case, but there is no case where 'higher throughput' results in a direct improvement in the user experience. In fact, the ultimate 'high throughput' scheduler would schedule every application for exactly one instruction, and even if the scheduler were as fast as possible, almost all of the CPU time on such a system would be spent scheduling. In practice I don't think a CPU will even allow slices that short, perhaps luckily.
It never ceases to amaze me how much nonsense like this gets repeated unquestioningly within supposedly educated academic circles. The Linux community has also been big on 'throughput', although they're at least consistent in never knowing what they're talking about. There was even a research project where an AI scheduler was built to 'maximize throughput' for the Linux kernel on large server systems, and if you look up that paper you find that they predictably adjust the time slices to the smallest possible value. And for the derp award, servers are also much more sensitive to turnaround times than latency. Great job, here's your PhD.
Ostrich Oriented Programming
While object-oriented programming is certainly useful, there are some ideas which currently dominate the OO community and influence language development that are not. The idea that OO should be used to "hide information" is perhaps the most central of these, and results in what could be better termed "ostrich oriented programming" or the "head in sand method".
The prime example of this flavor of crap is the NSMutableArray from gnustep/cocoa. Take an array, which in C is a straightforward little data structure, and then stick it in an object where you can't see any of its data except through convoluted access methods. Now you need an NSArrayManager to actually use your mutable arrays for whatever reason, and then you have to figure out how to use that. All to do something which started out as trivial in C.
Why? Because "hiding data is good"? What makes OO useful is that it allows you to combine related functions and data into one "structure" so that code is easier to organize; it isn't useful because it "hides" anything. Programmers need to access data for various purposes, and it should be simple and easy to do so. Sometimes objects can "hide" irrelevant details in terms of using them, which is nice but an entirely different subject. We're talking about an array, and unless this particular array holds domain-specific information inside a domain-specific object, there's no reason to "hide" any of its data, at least not from our enclosing object.
Of course, what they were probably trying to achieve was some illusion of "safety" or "security"; however, it is impossible to achieve security by hiding data. Bounds checking is present in many languages that don't even feature objects, so it's not like OO is required for that. As for data clobbering, a pure functional approach is the only way to ensure correctness, and no amount of pointless OO abstraction layers can fix it. The only thing they've actually managed to accomplish is to fill their libraries with countless useless "data-hiding" objects, often in cases where OO is not even useful at all.
That's not to say that it wouldn't be useful to turn an array into an object. Imagine if you could write something like:
int myarray = array[4];
myarray = myarray.zero();
myarray[0] = 3;
myarray[1] = 4;
myarray[2] = myarray[0] + myarray[1];
//etc.
This way we could have the benefits of arrays being objects (i.e. built-in convenience functions) without losing the ability to access them directly when it's more convenient to do so.
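For what it's worth, Rust's built-in arrays already behave roughly this way: directly indexable like a C array, yet with convenience methods attached, and no manager class in sight. A small sketch:

```rust
fn main() {
    // An array that is both directly indexable and carries built-in methods.
    let mut myarray = [0i32; 4];      // zero-initialized, like .zero() above
    myarray[0] = 3;                   // direct element access, as in C
    myarray[1] = 4;
    myarray[2] = myarray[0] + myarray[1];
    assert_eq!(myarray[2], 7);
    // 'Object-like' conveniences come for free, with nothing hidden:
    assert_eq!(myarray.len(), 4);
    assert_eq!(myarray.iter().sum::<i32>(), 14);
}
```

Nothing here is encapsulated away from the programmer; the methods are additions to the plain data structure, not a wall in front of it.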
As a side note, while the idiots who wrote "Design Patterns" may have been influenced by Christopher Alexander, Dr. Alexander's work should not be compared to theirs. Unlike the useless crap in "Design Patterns", Christopher Alexander's work is actually universally relevant to art and design, although it is not directly applicable to engineering. If they had actually stopped to consider what programmers need to do with programming languages on a daily basis, as Dr. Alexander does with all his buildings, then perhaps "Design Patterns" would not be the utter piece of garbage that it is. Alas, apparently their interest was only in selling copies of an expensive book full of gimmicks that they pulled out of their asses and then imposed on everyone in the form of authoritarian dogma. They seem to have succeeded at it, too.
Imperative is not Algebraic
From time to time I've heard the idea repeated that 'imperative programming is algebraic, while "declarative" programming is based on calculus'. Of course, they're probably referring to either lambda calculus in the case of functional programming (which, by the way, is useless to programmers) or predicate calculus in the case of logic programming (which is also pretty useless to programmers XD), but here I want to focus on this little "algebraic" bit of nonsense.
First, for those of you who haven't seen any recently, here are some examples of algebra:
y = 2x + 3
let x = 2 and y = 4 in z = 2x - y
f(x) = x^2 + 4x + 5
In algebra variables are defined in relation to other variables and constants, and may be stated in terms of a function as in "f(x)". Now here's some typical C-style (pseudo)code, demonstrating something that you would never see in an algebra textbook:
int x = 3;
x = x+2;
return x;
Note that here x is defined in terms of itself, which works because in C, x is assumed to hold some current value (3) and can be assigned a new value based on it (5). Algebra, on the other hand, has no such concepts of persistence or reassignment. Algebra also has no concept of performing actions, so "return x" is implicit in evaluating 'x' and is not a separate statement. Now compare this to some pure functional ML-style pseudocode:
let f(x) =
    let y = 3
    in
    x + y
Note that I scraped out all the useless lambda-style syntax to make it more readable. Oh, and look, it looks like algebra. Almost exactly like algebra in fact, except that you can refer to strings and other non-numerics. There are some other differences, for example side effects, but those aside, functional languages, once you remove the nonsensical syntax, are much more like algebra than imperative languages are, and imperative languages can hardly be called "algebraic" by any standard. To all the CS majors out there who blindly regurgitated this crap in your scientific papers: shame on you, you know who you are.
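As a concrete aside, the same contrast can be written out in Rust, which permits both styles side by side (a made-up sketch):

```rust
fn main() {
    // Imperative style: x persists and is reassigned,
    // which algebra has no way to express.
    let mut x = 3;
    x = x + 2;
    assert_eq!(x, 5);

    // Expression style: names are defined in terms of other names,
    // and the last expression is the value, just as in algebra.
    let f = |x: i32| {
        let y = 3;
        x + y // the block's value; no separate 'return' action
    };
    assert_eq!(f(4), 7);
}
```

The second form reads like the f(x) definitions above, while the first depends entirely on the machine notion of a storage cell changing over time.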
Monday, November 3, 2014
Do What?
Code (n): a system used for brevity or secrecy of communication, in which arbitrarily chosen words, letters, or symbols are assigned definite meanings.
In the modern day with our multi-core multi-gigahertz processors, with our apps for everything even on our phones that we can take with us everywhere, and with decades of steady research and development in computer science, you would think that programming would be pretty easy. I mean anyone can download a free IDE and compiler and just start programming, right?
Sure, if you can learn to read Greek and Martian, make a prodigious leap from 'hello world' to rocket surgery, and then put up with unintelligible libraries and accompanying mountains of opaque documentation. Oh, and it also turns out that most of that crap was useless in the first place, or unnecessarily difficult to use otherwise, surprise. All for what? A programming language? A tool whose sole purpose is ostensibly to help programmers turn ideas into running applications.
Is programming really difficult, though? In general, math beyond algebra is rarely encountered, although bitwise operations are a notable addition to the usual repertoire. For most languages the logic isn't terribly complicated either. The biggest barrier to entry, really, is that too many programmers have learned to think and write in a jargon-ridden commercial or unix style. Intimate knowledge of various technologies or insider jargon is often assumed, and explanations tend to focus on obscure features or quirks, if they're even recognizable as English at all.
The way most programming languages and their libraries are developed follows a similar pattern. You see a lot of quirky abbreviations that are inconsistent between languages; oftentimes syntax is obstructive and difficult to reason about; and in most languages there are at least some cases where a language feature conflicts with a clean solution. For all the talk about user interface design, the 'user interfaces' of most programming languages are downright atrocious. And that's saying nothing about other important language design decisions.
Rarely do you see tutorials that resemble English enough to be intelligible, and that follow a reasonable enough progression in complexity to be useful. Rarely if ever do you see a library that's organized and labeled in a straightforward way. Rarely do you get the feeling that the language designers were giving serious thought to how to make things simpler or easier, whether for trivial tasks or more difficult problems (DSLs are beside the point).
Of course, there are also many very interesting problems to be solved and things to be improved upon for specific areas of software design, but when the foundations of all that software are built upon such backward thinking and sometimes misinformation it's kind of hard to ignore. The user is first everywhere else, so why are programmers second-class citizens when it comes to the languages and operating systems that they work with, and why do they insist on making everything seem more complicated than it really is?