Archive for December, 2009

Fixed Width Coding


No pro-programmer I know prefers a variable-width typeface to code. On first thought, a fixed-width typeface makes it easy to align statements and numbers. But one can get all that, and more, using tab-stops. So, on second thought, why do we prefer fixed-width typefaces to code? Well, in coding, things that are similar should look similar. Here are three similar lines of code, in fixed- and variable-width.

aap_noot[iii] := 4.1;
aap_noot[mmm] := 4.2;
aap_noot[jjj] := 4.3;
  aap_noot[iii] := 4.1;
aap_noot[mnm] := 4.2;
aap_noot[jjj] := 4.3;

The variable-with code, is more dense, but look how hard it is to spot the error in the variable-with code, where I hit the key left to the ‘M‘ by accident. This error would have been more obvious in the fixed-width typeface because it differentiates glyphs better. Non coders usually don’t understand why this is important. But than again, they never lost three days of work because the ‘I’ looked like an ‘l’. There are special coding-typefaces that provide good differentiation within the glyph groups 0Oo and 1liIL. There is much more to it, so if you are using whatever typeface came with your IDE, you might want to spend some time on this subject. For example read what Dan Benjamin has to say [Top 10 Programming Fonts].

To Win Before You Code

No Comments »

avb1I know several über coders that would start a private printer server with: gcc -xc -;./a.out&, type for 90 seconds and their server would be up and running. I like them, the are generally fun people. I would never hire them though. It’s like having your car gassed up for free and find out they welded the gas tank shut. avb2 I would hire the kind that would think and than write (or download) some code and refactor it until it was solid. That is like having your car gassed up for half price and find out your car now runs a 100 miles to the gallon. Or maybe I would not hire anyone by the hour, but simply use something like ”Programmers are most effective when they avoid writing code. [...] The romantic image of an über-programmer is someone who fires up Emacs [or IYFEH], types like a machine gun, and delivers a flawless final product from scratch. a A more accurate image would be someone who stares quietly into space for a few minutes and then says ‘Hmm. I think I’ve seen something like this before’ (John D. Cook) [Why programmers are not paid in proportion to their productivity]. Managers love to reward according to measure. But code productivity is hard, if not impossible, to measure. If team productivity is hard to figure out, it’s even harder to measure the contribution of individuals on that team. You can get a rough sense of a team’s output by looking at how many features they deliver per iteration. b It’s a crude sense, but you can get a sense of whether a team’s speeding up, or a rough sense if one team is more productive than another. But individual contributions are much harder to assess. While some people may be responsible for implementing features, others may play a supporting role – helping others to implement their features. Their contribution is that they are raising the whole team’s productivity – but it’s very hard to get a sense of their individual output unless you are a developer on that team (Martin Fowler) [Cannot Measure Productivity].

Debugging 27 Year Old Code

1 Comment »

Remember those days? About twenty seven years ago? You were on a role with Donkey Kong, playing at some high level—level twenty two, actually. And boom, time is up, you’d be dead. That is, Jumpman (“Mario,” to you DSi wielding youngsters) would be dead. Dead as a doornail, how unfair! My, then, therapist told me, that it was not as important as I thought it was, the moron. That I would forget it eventually, and go on with my life, double moron. How can I forget? I still cry myself to seep every night over how unfair life is. But now, thanks to Don Hodges, the world (“my world” as my current therapist refers to it) is back in harmony again. Thanks Don, for fixing Donkey Kong. I can sleep in peace, now.

This is what the (overflow) bug looks like:

0F7F   17 RLA        ; Rotate Left the bits in A
0F84   80 ADD A,B    ; A = A + B
0F85   80 ADD A,B    ; A = A + B
0F86 C628 ADD A,#28  ; A = A + #28 (40 decimal)
0F88 FE51 CP #51     ; Is A >= #51 (81 decimal) ?
0F8A 3802 JR C,#0F8E ; No, then skip ahead to #0F8E
0F8C 3E50 LD A,#50   ; Yes, then A = #50 (80 decimal)

Bijective Programming

No Comments »

In 1987 a friend of mine asked for some help with his (Turbo) Pascal home assignment. He had to write a program that would convert Roman numerals to Arabic. A classical exercise. He had written some 200 LOC. He noted that graceful handling wrong input necessitated a lot of if statements and cleanup code. It made the code bloated and hard to read. He still was unsure if his code would catch all possible wrong input—it did not. I explained to him that the bijection (i.e., converting from Arabic to Roman) was straight forward. So we wrote 20 LOC that would, for any Arabic integer input, generate the correct Roman string. Than we refactored his code until about 30 LOC were left. It would generate the correct Arabic integer for any well formed Roman string, and some integer value for any other input. Than we wrote some 10 LOC that would take a string convert it to a number and back to a string and compare it to the original input. If they were the same, it would proclaim the answer, otherwise it would print something like this:
I'm sorry Dave, 'IM' isn't Roman, did you mean 999 (CMXCIX)?
His solution would baffle his classmates, yet he got one of the lowest notes, because the (math) teacher was totally confused about the Arabic to Roman part—plus we did not know his name was Dave. The teacher failed to see that the bijective programming, as usual, resulted in very clean and robust code, with output that bordered on AI.

Code Locality


BlumaMost coders agree that non trivial [p]rograms [...] obey the principle of locality. (P.J. Denning and S.C. Schwartz) [Properties of the Working-Set Model] This principle is also known as the locality of reference and is divided in two types: temporal locality and spatial locality. However I postulate there is a third type: code locality. Ideally, source code consist of relative small blocks, each at their own level of abstraction, fully understandable in isolation. This makes sense if you realize that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. (E.W. Dijkstra) [Go To Statement Considered Harmful] This is why a goto needs a limited scope, a global variable needs to be constant, and multithreading needs queuing. Most agree that there exist severe limitations on the amount of information that we are able to receive, process, and remember. By organizing the stimulus input simultaneously into several dimensions [abstraction layers] and successively into a sequence or chunks [code blocks], we manage to break [...] this informational bottleneck." (G.A. Miller) [The Magical Number Seven] To prevent brain overload code locality is necessary. It will turn writing (or just understanding) code into a sequence of separate tasks where one task can be completed before the next is undertaken. "Completing the task means resolving the tension system [...] If a task is not completed, a state of tension remains". (Bluma Zeigernik) [On Finished and Unfinished Tasks] (Hint: click the photo.)

Premature Optimization

No Comments »

en_1212_upI have a confession to make. When I code, I feel like a deity. Honest I do. Rules don’t apply to me. In this domain, I make the rules. Especially, the severe penalties for premature optimization don’t apply to me, or so it feels. I don’t mean that in a big way, I mean that in a million small ways. Not premature optimizing is so medamn awful hard. I know Pareto, yet I just can not contain my self 20% of the time, so I get 80% of the penalties. I admit, I once wrote i += i < 0 ? -1 : 1; to save a boolean. But no more! We can learn to help each other. If you, after coding clean for a week, broke an abstraction layer, or if you, after coding clean for month, drop below the 1:3 remark:code ratio, or if you, after coding clean for a year, suddenly have a laps and spend six hundred and sixty six hours on my_malloc(), than join the POA.

Premature Optimization Anonymous® is a fellowship of men and women who share their experience, strength and hope with each other that they may solve their common problem and help others to recover from premature optimization. The only requirement for membership is a desire to stop premature optimizing. There are no dues or fees for POA membership; we are self-supporting through our own contributions. POA is not allied with any sect, denomination, politics, organization or institution; does not wish to engage in any controversy, neither endorses nor opposes any causes. Our primary purpose is to code clean and help other premature optimizers to achieve sobriety.

First meeting of the Dutch POA is in the RAI in Amsterdam, december 31st.

Names Should Constitute Knowledge

No Comments »

When you know all the names, in every language, of that bird, you know nothing but absolutely nothing about the bird. (Richard Feynman) This is mostly true for normal life, but not for solid code. In fact names constituting knowledge is the the basis of good code locality—meaning that code should consist of relative small blocks, each at their level of abstraction, fully understandable in separation.

Skip List

No Comments »

skiplistThe skip list is a relative unknown data structure. If you know it already you can stop reading. If not, keep reading because “the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.” (William Pugh) [Skip lists: a probabilistic alternative to balanced trees]. Speaking from experience, I have to say this is true. Coding up skip lists is easy. I’ve written a reentrant version that uses read/write locking, a version for a hash table to replace chaining, a version that allows Θ(log(n)) indexing, and an unrolled version. It is not perfect for all problems, but it is adaptable and has some other nice properties. However, I like it best for its simplicity. To find the 7 in the above skip list, we start at the top left, and go right until we hit a value >7 (or NIL). We go down one and continue to the right, and so on. So we visit the nodes 1, 4, 6, and 7. Search usually is Θ(log(n)). Since the reentrant version is somewhat less trivial to write, click this link for a GPLed implementation.

NULL’s a Bitch

No Comments »

Anyone that wrote any non trivial SQL query with an outer join, will know: NULL‘s a bitch. By the time you learn to write:

SELECT * FROM T  -- contains bitches.

your hair has turned gray, if you did not pull it all, that is. When the ω—the symbol for ‘unknown’—was introduced by Codd, I guess he basically tried to support ternary logic. Later, he must have realized that ω, in non boolean columns, can stand for many different problem specific meta-values. In version 2 of The Relational Model for Database Management (Chapter 8), he tried to remedy the problem by splitting NULL (ω in SQL) into A-mark (applicable-value mark) and I-mark (inapplicable-value mark). But first, this sinned against Bruce MacLennan’s zero, one, or infinity principle and second, almost nobody adopted it. Hence different SQL implementations handle non trilogic usage of NULL differently. For example the expression 'A' || NULL || 'C' has different values in different SQL implementations. The same goes for aggregates, COUNT(*), NULL / 0, and indexing. Also a whole host of NULL related functions were invented like COALESCE(). Since any usage of NULL can be mimicked by adding a boolean column and some careful code, I am of the opinion that the introduction of NULL was a huge mistake, and the current standard and implementations aggravate the problem by implementing a lot of undefined or implicit behavior. Other solutions that would have been better are (user-definable) meta- or default-values. Anything but NULL.

Forget My Number

No Comments »

Sometimes impossibles find their way into your code specs, like an intuitive GUI component that will make complex selections simple, or a wrapper that will make application objects into distributed objects, transparently. The list is endless. This is when all those theory courses, finally, pay off. You just know it is impossible instantly, where you otherwise would waste days trying the impossible. Or worse, you actually implement something. Because of the world wide web, nowadays, you can skimp on theory classes and use Google. For example, Merchants who take credit cards over the Internet for payment need to get rid of the number the microsecond after they’ve placed a hold on the funds, because it is impossible to store that number in a form that is both searchable and secure. (Cranky IT Guy) [How to secure a credit card number] (via: DI) You don’t have to know, anymore, about Knuth, Shannon, Dijkstra, Chomsky, Turing, Erdős, Parnas, Minsky, and the many others. But they are still fun!