Log in

Heap: the invisible data structure - Luke's Weblog [entries|archive|friends|userinfo]
Luke Gorrie

[ website | My Website ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Heap: the invisible data structure [Apr. 9th, 2007|12:51 am]
Luke Gorrie
[Tags|, , ]

Imagine for a moment that all the lists in our programs suddenly switched to the performance characteristics of vectors, and vice-versa. Wouldn't that be awkward? Imagine even worse: that in SBCL they behaved as normal, in Lispworks they were switched, in Allegro they both behaved like balanced trees, and in OpenMCL it was all undefined. How would we write programs then, without depending on things we take for granted like fast vector random-access and APPEND only taking time proportional to the left-hand list's length?

It would be bloody hard. We'd have to avoid sequences as much as possible not only in code that has to be fast but also in code that just has to be predictable. Our programs would surely be awkward to read because of all the squirming to avoid calling sequence functions.

Thank goodness this is just a dream. It would be too silly.

Now imagine for a moment that an even more fundamental data structure lacked good and predictable behaviour that you could depend on: the pervasive garbage-collected heap. What then, I ask you?

Update: Feedback suggests that this post didn't make the point I wanted to. I hope my sbcl-devel post helps to say what I'd like to have as a Lisp user.

[User Picture]From: leon03
2007-04-26 05:50 pm (UTC)
Hey, I picked that thread up through LtU, and in the thread you mentioned that you had a new blog there too, so that brought me here.

Interesting thread; I do dabble in Erlang. You mention that Erlang processes each have their own separate heap, and that there are no inter-process pointers. The trick reminds me of stories my professors have that back in the day, it was standard practice to simply reboot the computer when memory ran out instead of letting the collector do it's work.

My understanding is that Erlang processes, at least on a single node, share a common heap and pass messages via pointers. Wait... that's a ./configure option, IIRC, so I guess I answered my own question. Which option is most often used in actual deployments?

I suppose you could have a many-to-one function from processes to heaps, using an option to spawn to specify a heap, allowing the programmer to use both models in conjunction.

Anyway, here's a few of my journal entries you might enjoy: journey to unemployability, queues, and concurrency.

Take Care.
(Reply) (Thread)
[User Picture]From: lukego
2007-04-29 08:15 pm (UTC)
Erlang historically used separate heaps for each process. More recently the HiPE group in Uppsala have made a shared-heap virtual machine (configure option) with different performance characteristics. I like to choose one behaviour and exploit it and that for me is separate heaps (partly because I'm stuck on OTP R9 at work). I also like to write code that annoys people in Uppsala (my favourite is applying {Mod,Fun} tuples) in petty retaliation for historical bickering with Kostis Sagonas (who is actually a very nice guy).

Nice blog and welcome to unemployability! I'm finding it's surprisingly lucrative. :-)
(Reply) (Parent) (Thread)