<sub>2025-04-27 @1200</sub> #digital-garden #lua #programming #luajit #lua/learning-lua # Table Looping Thoughts I've been giving more thought now to loops after writing some recent Neovim plugins. I really like the simplicity of using `table.insert` especially when performance isn't a concern. However, what if performance is a concern? What could I say about `table.insert` in contrast to other methods? Would I be correct to even assume a more performance oriented approach? ## The idea ... I've seen so many examples out there that attempt something like the following using a length-based append over `table.insert`. The idea behind using `t[#t+1] = val` is to avoid function call overhead, minimize branching, and eliminate potential element shifting. ```Lua #!/usr/bin/env luajit local N = 1024 * 1024 local function loop() local out = {} for i=1, N do out[#out+1] = i -- alternative style being -- table.insert(out, i) end end ``` ## The reality ... Even though I would think `out[#out+1]=...` has fewer instructions and no function call overhead it ends up being expensive on larger tables. The expense being `#out` always getting recomputed. If you look at the source (Lua 5.1, Lua 5.2, LuaJIT) the evidence suggests that $O(1)$ length lookup cannot be guaranteed due to the possibility of linear scanning (which is $O(n)$). ```C // Lua 5.1 reference (lobject.h) #define luaH_getn(t) (luaH_getn_impl((t))) ``` For awhile I thought that perhaps the net effect on tables with smaller lengths would be faster with using `out[#out+1]=val` but I don't see that being the case. The fact is `table.insert` is implemented in C and aggressively caches the table's length, making it more efficient for growing tables than recomputing `#out` on every insertion. ## How can Lua even guarantee $O(1)$ with `#` In Lua, tables are a hybrid of an **array** part (for sequential numeric keys starting at 1) and a **hash** part (for everything else strings, sparse numbers, etc.). For example, if I declare a sequential numeric table, Lua will optimize it into an internal array: ```Lua local t = {1, 2, 3, 4} ``` Internally, I know that Lua stores these entries densely in the array portion for fast access. However, at any point, a table can lose its pure array property. ```Lua local t = {1, 2, 3, 4} t[2] = nil -- create a hole t[10] = 42 -- insert at a weird index ``` Now, Lua may partially (or fully) move data into the hash part, making efficient array behavior impossible to guarantee. When I ask for the length (e.g. `#t`), Lua must determine where the sequence ends. Internally, it does this via the function `luaH_getn_impl`, which can trigger a linear scan over the table thus ... - If the table is dense and contiguous, Lua may check the cached `sizearray` give $O(1)$ behavior. - If the table has holes or becomes sparse, Lua must scan linearly, degrading to $O(n)$ behavior. So again, Lua cannot always guarantee $O(1)$ time for the `#` operator. So what if the table remains dense. ```Lua local t = {} for i = 1, 1000000 do t[i] = i end t[1000000] = nil -- remove last element print(#t) --> 999999 ``` Here, Lua can quickly find that `t[999999]` is the last non-nil element because no internal holes existed before the last deletion. Ok but what if holes appear? What will change? ```Lua local t = {} for i = 1, 1000000 do t[i] = i end t[500000] = nil -- hole in the middle print(#t) ``` Lua still prints $1000000$ because `t[1000000]` is still non-nil. Lua finds some boundary (not necessarily the first hole), satisfying the formal rule. > The length of a table `t` is any non-negative integer $n$ such that $t[n] \neq \text{nil}$ and $t[n+1] = \text{nil}$. ## One more thing ... If we still wanted to pursue something other than `table.insert` we could consider using an explicit index. ```Lua local N = 1024 * 1024 local function loop() local out = {} local n = 0 for i=1, N do n = n + 1 out[n] = i end end ``` At least here there would be no `#out` recalculation, no `table.insert` with just a direct indexed assignment. ## Conclusion When growing Lua tables or iterating them, the method you choose matters more as the table size increases: - For small tables, both `out[#out+1]=val` and `table.insert(out, val)` are acceptable. - For large tables, `table.insert` is generally faster and more reliable than `out[#out+1]=val`. - For maximum control and performance, maintaining your own insertion index is the best approach. Understanding these tradeoffs helps when micro-optimizing LuaJIT code, particularly in tight loops or plugin development.