#+title: lowest case
#+author: santi
#+description: a lower case only blog, purely for aesthetics
#+hugo_base_dir: .
#+hugo_auto_set_lastmod: t
* home
:PROPERTIES:
:EXPORT_HUGO_SECTION: /
:EXPORT_FILE_NAME: _index
:END:
a lower case only blog, purely for aesthetics.
here I talk about discrete mathematics (mostly computer science), including compilers, language theory, type theory, computability theory, software correctness, formal verification, and any other (entirely theoretical, almost non-applicable) nerd topic you can think of.
* about
:PROPERTIES:
:EXPORT_HUGO_SECTION: /
:EXPORT_FILE_NAME: about
:EXPORT_HUGO_CUSTOM_FRONT_MATTER: :menu main
:END:
i'm leonardo santiago, a software engineer based in brazil. my interests are in compiler design, programming tools (emacs), functional programming, proof languages and a bit of devops using nix/nixos. i'm most comfortable in nix and rust, but i know a fair share of other languages.
currently, i work as a software engineer @ [[https://mixrank.com][Mixrank]], and you can find my cv [[./static/cv.pdf][here]].
if you feel like smugly responding to any of my posts (or just want to kindly send me a message), these are my socials:
- [[https://www.linkedin.com/in/leonardo-ribeiro-santiago/][linkedin]]
- [[https://github.com/o-santi][github]]
this blog was built using emacs' excellent org-mode and [[https://github.com/gohugoio/hugo][hugo]] (with the [[https://github.com/janraasch/hugo-bearblog][bearblog theme]]), and its source code is openly available [[https://git.santi.net.br/blog][here]].
* blog
:PROPERTIES:
:EXPORT_HUGO_SECTION: /blog
:END:
** rust is not about memory safety :rust:correctness:
SCHEDULED: <2024-06-01 Sat>
:PROPERTIES:
:EXPORT_FILE_NAME: rust-is-not-about-memory-safety
:END:
most rust discussions nowadays revolve around memory safety, and how it is safer than C / C++ / zig / go / whatever language is being trashed on twitter that day. while yes, that is true - not that the bar for most of these is particularly high - what i think is the main point of the language always gets glossed over: correctness. when one tries to criticize any of the aforementioned languages, one is answered with the following argument:
#+begin_quote
your program segfaults? skill issue
#+end_quote
but i'd like to make the counter-argument that, no, this is not a skill issue.
*** formal language theory
the first thing one learns when studying formal languages (the field that studies grammars, state automata, etc) is that the rules that describe a certain grammar must generate *exactly* the words that you want to include in your language. this means that there's a bidirectional relationship between the grammar you describe (which directly defines the automaton that parses that language) and the words[fn:: formally, a word is a sequence of tokens over a certain alphabet, consumed by the automaton. normally we think of "words" as the whole program that we're parsing.] that it parses (which are related to the semantics of the program, how it executes).
from this, it can be inferred that the grammar *must not* allow into the language any word that does not have defined semantics, and, in the opposite direction, that the language should not assign semantics to any program that cannot be parsed by the rules of the grammar. both of these are required in order to make the study of this grammar <-> language partnership fun, pleasing, and most importantly sound.
going beyond, formal language theory also tells you that the execution of any program can be given as a set of grammar rules over an abstract machine (the most famous one being the turing machine). in the same way you can define a set of grammar rules to parse parenthesized arithmetic expressions using a stack automaton, you can define a set of grammar rules to model the execution of a C program which, albeit super complex, can be modeled as a turing machine. this usually gets the name of the C abstract machine, and is the basis for the formal specification of behavior in the language.
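for the parenthesized-expression case, the automaton is small enough to sketch outright (a toy recognizer in rust, just to ground the analogy):
#+begin_src rust
// recognizes balanced parentheses by pushing on '(' and popping on ')':
// the word is in the language iff the stack never underflows and ends empty.
fn balanced(word: &str) -> bool {
    let mut stack = Vec::new();
    for c in word.chars() {
        match c {
            '(' => stack.push(c),
            ')' => {
                if stack.pop().is_none() {
                    return false; // closing with nothing open
                }
            }
            _ => {} // other tokens don't affect balance
        }
    }
    stack.is_empty()
}

fn main() {
    assert!(balanced("(1 + 2) + 3"));
    assert!(!balanced("(1 + 2) + 3)"));
}
#+end_src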
and no, i'm not talking about modeling a C parser as a state machine (which probably is easier than most languages, if you ignore pre-processor stuff). i'm talking about modeling C *execution* as a language being parsed. drawing a parallel: when parsing parenthesized expressions, you push to and pop from the stack to represent "balancedness", and in the same way, when "parsecuting" C code, you must write to memory, represent side effects, represent type casts and pointer conversions and everything *as part of the language*.
in the same way that you'd hope that a parenthesized arithmetic expression parser would recognize that ~(1 + 2) + 3)~ is an invalid expression, you'd expect that the C compiler would correctly verify that the following series of tokens is not a /well behaved/ program:
#+begin_src c
#include <stddef.h>

void foo(int *myptr) {
  *myptr = 5; /* unconditionally dereferences its argument */
}

int main(void) {
  foo(NULL);
}
#+end_src
i say /well behaved/ because i can't say /invalid/: the spec does in fact define that dereferencing a ~NULL~ pointer results in [[http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html][undefined behavior]]. and this is C's achilles heel: instead of outright banning programs like the one above (which i'd argue is the correct approach), the compiler will happily accept them and emit garbage output.
framing it this way really exposes the fragility of C, because undefined behavior always has to be taken into account. and, by its nature, there is no way to represent it other than as a black box: if your code ever encounters it, then literally all you can say is that *the whole result of the program* is undefined - that is, it can be anything. you cannot show properties, nor say what will happen once your program enters this state, as the C specification literally does not define it. it may come to a halt, write garbage to the screen or delete half of your files, and there's no way to predict what will come out of it, by definition. in the lucky case, it will segfault while executing and you'll be extremely pissed off, but even that is not guaranteed. this is akin to a float expression with some deep term being ~NaN~: the whole thing eventually evaluates to ~NaN~, and you can't draw any conclusions about the result of the expression (other than that it isn't a number).
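to see the analogy in action, here's a trivial rust sketch (though any language with IEEE 754 floats behaves the same way):
#+begin_src rust
fn main() {
    // one NaN buried deep inside the expression...
    let deep_term = f64::NAN;
    // ...poisons every operation that touches it.
    let result = ((deep_term * 2.0) + 3.0).sqrt() - 1.0;
    // all we can conclude about the result is that it isn't a number.
    assert!(result.is_nan());
}
#+end_src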
language designers and compiler developers are by no means dumb, and yes, they know much, much more than me about these problems. undefined behavior exists exactly because there are states that the compiler *must* assume are impossible, so that it can emit sensible code at all. for example, let's say that you inadvertently try to dereference a pointer that you have no knowledge about. the C compiler simply does not have enough information to know if it is ~NULL~, if it is still pointing to valid memory, or if that memory has been initialized, so its approach is to simply emit code *as if* it were a valid, initialized, non-null pointer.
it is essential to realize that this is an *assumption*: in most cases the compiler does not care whether the pointer was actually still valid, so writing to it may have a myriad of different effects (none of which are the compiler's concern). sometimes, your system might correctly intercept a read/write on invalid/null memory and raise a signal, but even that is not guaranteed.
and there are a huge number of tools to aid in finding undefined behavior in a code base, it's just that
1. they are not by any means standard in C development (they're neither in the spec nor part of standard compilers), and
2. they are fallible, and will always let some undefined programs slip by.
*** runtime exceptions are not the solution
most languages try to handle this by introducing some sort of runtime exception system, which i think is a terrible idea. while this is much, much safer than what C does, it still makes reasoning about the code extremely hard by completely obliterating locality of reasoning. your indexing operation may still be out of bounds, and while this now has a defined outcome, it is one of the possible outcomes of your program (whether you like it or not), and you must handle it. and, of course, no one handles all of them, for it is humanly impossible to do so in most languages, because:
1. it is hard to know when an operation can raise an exception, and under which conditions.
2. even if documented, it is never enforced that all exceptions must be gracefully handled, so some random function in a dependency of a dependency may raise an error from an unexpected corner case and you must deal with it.
this is a symptom of virtually all modern languages, and none of them have any good answers to it. java mandates that you report in your function's type signature the errors it may raise (which is a rare java W), but it does let you write code with [[https://docs.oracle.com/javase/tutorial/essential/exceptions/runtime.html][unchecked exceptions]] that won't signal a compile error if ignored, which will eventually crash your minecraft game. python, ruby, php and most other languages (even [[https://www.tweag.io/blog/2020-04-16-exceptions-in-haskell/][haskell]] made this mistake) do not even attempt to signal when a function might raise an exception. javascript somehow manages to be even worse, by having horrible implicit-by-default type casts, having both ~undefined~ AND ~null~, storing strings as UTF-16, using floats as the default numbers, implicitly inserting semicolons, and honestly the list could go on forever.
the root of all these problems is, quite literally, the same: your compiler (or interpreter) lets into your program execution states that you didn't anticipate. one of the best examples of the opposite, surprisingly enough, is regex matchers. while i concede that their syntax can be extremely confusing, they have the best property of software: if they compile, they work exactly as intended - which i will henceforth call *correctness*. this is because regular languages' properties and their state automata have been studied to extreme depths, and it is entirely possible to write a regex implementation that is *correct* (in the sense above), going as far as formally verifying it[fn::the excellent software foundations book [[https://softwarefoundations.cis.upenn.edu/lf-current/IndProp.html][explains thoroughly]] how one might formally write one possible regex matcher, and prove that the implementation is correct].
from this definition of *correctness* we can also derive a semantically useful definition for the word bug: an unexpected outcome of the program, one that shouldn't be allowed by the language. of course, java's behavior might be defined for all inputs (for the most part - i'm sure there are problems here and there), but just because one possible outcome of a program is ~NullPointerException~ doesn't mean that it is *expected*, making it, by my definition, a bug.
*** make invalid states unrepresentable
what the regex example makes clear is that the key to correctness is to make your language tight enough to have defined and *desired* output for all possible inputs. this is not to say that it won't raise errors; much to the contrary, it must have parser errors saying that some strings aren't valid regexes. instead, it means that all errors are *predictable*, and *well defined* (in some sense).
you, as the programmer, are then in charge of ensuring that the resulting regex program actually solves the problem you have at hand. want to match 3 words of 2-digit numbers followed by a capital letter? great, regexes can do that. want to match balanced parenthesized expressions? sadly, regex is incapable of ever solving that, because that language [[https://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages#Use_of_the_lemma_to_prove_non-regularity][is not regular]], so no matter how hard you try, it never will.
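concretely (a small rust sketch, assuming the ~regex~ crate, and one possible reading of the pattern described above): both outcomes - a working matcher or a construction error - are predictable and well defined, and no third, undefined state exists.
#+begin_src rust
use regex::Regex; // assumes the `regex` crate as a dependency

fn main() {
    // three 2-digit numbers followed by a capital letter: regular, so expressible.
    let re = Regex::new(r"^\d{2} \d{2} \d{2} [A-Z]$").unwrap();
    assert!(re.is_match("12 34 56 X"));
    assert!(!re.is_match("1 2 3 x"));

    // a malformed pattern fails *at construction*, loudly and predictably.
    assert!(Regex::new(r"(\d{2}").is_err());
}
#+end_src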
in a way, there's a beauty in how C sidesteps this: it defines one of the possible program outputs as being /undefined/, and it is on the programmer's behalf to ensure that the program has 0 paths to /undefined behavior/. in fact, it is probably one of the most [[https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf][well specified languages]], which is what makes it suitable for writing formally verifiable programs [fn::through the use of external tools like coq's [[https://vst.cs.princeton.edu/veric/][verifiable C series]]].
the main strength of rust, and where it differs from all mainstream languages, is that it has a very clear focus on program *correctness*. the raison d'être of the borrow checker is statically assuring that all references point to valid memory, such that it is literally impossible for any borrow to be null or to point to freed memory (modulo [[https://github.com/Speykious/cve-rs][implementation errors, of course]]). this completely rules out this class of bugs from the language we're trying to "parse". remember the C excerpt from above, where i said that the compiler should rule out the program as invalid? well, it is literally impossible to write that sort of program in rust, because one cannot construct a ~NULL~ reference.
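to make the comparison concrete, here's (roughly) the same program from the C excerpt, sketched in rust - the "call it with ~NULL~" case is simply not writable, and absence has to be spelled out as an ~Option~ that the compiler forces you to handle:
#+begin_src rust
// a reference is statically guaranteed to point to valid memory.
fn foo(myptr: &mut i32) {
    *myptr = 5;
}

// absence must be explicit, and both branches must be written
// before this compiles.
fn maybe_foo(myptr: Option<&mut i32>) {
    match myptr {
        Some(p) => *p = 5,
        None => eprintln!("nothing to write to"),
    }
}

fn main() {
    let mut x = 0;
    foo(&mut x);     // fine: &mut x is always valid
    maybe_foo(None); // the "null" case, now well defined
    // foo(NULL) has no equivalent: there is no null reference to construct.
}
#+end_src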
not only that, but rust's language features make it so, so much easier to write *correct* software: sum types (tagged unions), ~Option~ instead of ~NULL~ (which in and of itself is amazing), ~Result~ for errors (making it obligatory to handle all possible branches your program can take), a strong and powerful static type system, and ditching inheritance and classes in favor of traits.
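as a small illustration of how ~Result~ forces the error branch into view (a minimal sketch, not prescriptive):
#+begin_src rust
use std::num::ParseIntError;

// the failure mode is part of the signature: callers cannot get to
// the port without deciding what to do when parsing fails.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.parse::<u16>()
}

fn main() {
    match parse_port("8080") {
        Ok(port) => println!("listening on {port}"),
        Err(e) => eprintln!("invalid port: {e}"), // handling this is not optional
    }
}
#+end_src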
note that i never once talked about memory safety. even in a world where C wasn't in fact full of memory vulnerabilities, rust would still be miles better, because it statically assures you that the *meaning of your program is tightly reproduced by the code you've written*. it is, by design, more correct than C, and the only way a problem can possibly happen is by sidestepping rust's static checks using ~unsafe~.
it is just a happy coincidence that this leads to a language that isn't garbage collected, that is relatively lean, fast, easy to embed, has good ergonomics, and that enables you to write asynchronous and multi-threaded programs. these properties are awesome for boosting rust to a very well regarded status among developers, but they aren't what makes a language suitable for building reliable, correct software. out of curiosity, i'd happily defend the case that [[https://coq.inria.fr/][coq]] is also one of these languages, and it holds none of those properties.
*** software engineering as a craft
finally, i think this relates to how i personally model the software development job as a whole. it starts with some problem you think you can use computers to solve, followed by 3 clearly stratified steps:
1. define how one might solve the problem. this usually means splitting it into several possible cases and treating each and every one of them separately.
2. define an abstract machine that executes the very same steps, *making sure that it tightly adheres to your plan*
3. implement the very same machine in a language, *making sure that your implementation adheres tightly to your abstract machine*
the part that programmers usually get paid millions of dollars for is step *1 -> 2*, which is by far the hardest and requires the most creativity and craftsmanship. what usually makes people say that [[https://www.youtube.com/watch?v=FeAMiBKi_EM][software is in decline]] is that we don't value executing step *3* properly. this leads to sloppy, half-baked software that crashes when X Y Z happens, and we've just come to terms with software being this brittle.
it is not by chance that Yang et al. could find only a measly 9 bugs after 6 CPU-years of fuzzing [[https://compcert.org/man/manual001.html][compcert]], a formally verified c compiler (written in coq), whereas in gcc and clang they [[https://users.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf#subsection.3.2][found and reported more than 300]]. all 9 of those bugs were in the unverified front end of the compiler (the parser), and there were literally 0 middle end (compiler passes and AST translations) bugs found, which is unheard of. this is not by chance: they've spent many years writing proofs that all of their passes are correct, safe, and preserve the meaning of the original program.
i really think software developers should strive for that kind of resilience, which i believe can only be achieved by properly valuing *correctness*. i don't think it is reasonable to expect all software to be built in coq, proving every little bit of it (due to business constraints), but i think rust is a good enough language to start taking things more seriously.
** self host the world :nix:nixos:
:PROPERTIES:
:EXPORT_FILE_NAME: self-host-the-world
:END:
I've known since forever that google is not to be trusted, and that [[https://killedbygoogle.com/][they whimsically create and destroy]] products like no one else. I've also been a not-so-proud owner of a google mail account for the past 15 years, which I rely on for almost all my services.
honestly, those facts didn't bother me that much because, like everyone else, I was always under the impression that /it isn't going to happen to me, right/? that was, until june of last year, when [[https://www.theverge.com/2023/6/16/23763340/google-domains-sunset-sell-squarespace][google sunset google domains]] - which I had been under the impression was the king of domain registrars. that, plus the rise of AI and LLMs, seriously made me question the hegemony of google search in the current state of affairs, and how well positioned I was for a possible google meltdown as a whole.
of course, I don't think gmail is going anywhere soon, but it tipped me into exploring the world of self hosting. I mean, how hard could it be to host my own emails, right? I wanted to find out using a home device and nixos, in order to get declarative and reproducible systems for free.
*** the raspberry pi
I managed to get my hands on a raspberry pi model 4B in december 2023, but at the time I didn't have time to try and get something running on it. it was just around april or may of 2024 that I actually started trying to get it working. at first, I wanted to go with a completely headless nixos setup: writing a proto-configuration akin to my [[https://git.santi.net.br/nixos][current ones]], exporting it as an sd image with my ssh key baked in, and flashing it to the pi. thus, no manual installation process would be necessary - just inserting the sd card into the pi and turning it on would be all.
sadly it didn't work, as in: it would turn on but never appear as a network device, and given that detecting it through the network was the only way to interact with it [that I knew of], it left me at a dead end. at the time, I believed I was doing something wrong in the nixos configuration - maybe some network config mishap? maybe I forgot to turn something on? - but given that I couldn't see its video output, I just gave up and decided to buy an HDMI adapter and do it the normal way.
of course, I only bought the hdmi adapter about 3 months later, and only then did I try to install nixos manually. I went with the normal approach: downloading the bare arm image, fetching [[https://git.santi.net.br/nixos][my nixos repo]] locally and rebuilding to install everything. only with visual confirmation did I understand that the problem wasn't with my original nixos image, but rather with the fact that the pi was shutting down after the first boot phase!
that made me realize I had never given proper thought to buying a real power supply, as I figured that connecting it to my computer's usb port through a usb-c cable I had lying around was good enough. I was able to gracefully connect the dots and realize that it was most likely rebooting because it didn't have enough power to boot, so I switched to a 5 volt, 3 amp cellphone charger I had to spare, and it finally booted correctly!
*** networking issues
after that, I figured that I'd like to be able to not only turn it on, but also connect to my raspberry pi from outside my house's network. sadly, my router's public ip changes pretty much every day, so my only real option was to use ddns + a domain name.
I bought =santi.net.br= cheaply and quickly transferred it to cloudflare, as I wanted to get some ddns action going on. as I'm using the ISP-provided all-in-one [shitty] router, it's not shocking to say that trying to open the relevant ports (22, 80 and 443) in the default configuration interface wouldn't have any external effect whatsoever.
I found out that there was a way to get the "admin version" of the router's setup page, and through that I was able to get port 22 open to the public internet (even though changing it the normal way wouldn't do anything), but neither 80 nor 443 were reachable still. I even questioned whether my network was behind CGNAT, as that is very common in brazil, but my ip wasn't in one of the common ranges, and I could access port 22 of my router's public ip just fine. I don't know how the ISP could be blocking it, other than the router's admin page port forwarding setup being a no-op for some specific ports.
I fought with this problem for a week but eventually decided to give up and just set up cloudflare tunnels for ports 80 and 443, and route all the subdomains through that. cloudflare tunnels work as an outbound-only connection, using a ~cloudflared~ instance running on the pi to route the requests through. after using some stateful commands to generate credentials, the relevant piece of code to set this up in nixos is very simple:
#+begin_src nix
{
  # ...
  services.cloudflared = {
    enable = true;
    tunnels.iori = {
      default = "http_status:404";
      credentialsFile = "/var/lib/cloudflared/iori.json";
      ingress = {
        "santi.net.br" = "http://localhost:80";
        "git.santi.net.br" = "http://localhost:80";
      };
    };
  };
}
#+end_src
though, I couldn't really use these tunnels to connect through ssh, and honestly I don't know why. I believe cloudflare expects you to use their [[https://developers.cloudflare.com/cloudflare-one/connections/connect-devices/warp/download-warp/][warp]] tool to authenticate ssh connections (besides ssh key auth?), but I thought it was too much trouble to configure yet another tool (on all my computers), so I chose to use the router's public ip + ddns with port forwarding instead. I tested pretty much all the ddns services exposed in nixpkgs, and the only one that worked reliably was =inadyn=:
#+begin_src nix
{
  # ...
  services.inadyn = {
    enable = true;
    user = "leonardo";
    group = "users";
    settings.provider."cloudflare.com" = {
      hostname = "santi.net.br";
      username = "santi.net.br";
      proxied = false;
      include = config.age.secrets.cloudflare.path;
    };
  };
}
#+end_src
**** remote rebuilds
given that my computers' architecture (=x86_64-linux=) and the raspberry pi's (=aarch64-linux=) are not the same, I needed a way to either trigger rebuilds remotely, or cross compile locally and =nix-copy-closure= the result to the pi. building =aarch64-linux= derivations locally can be enabled by setting =boot.binfmt.emulatedSystems= (qemu emulation), but I don't really like that solution, as it requires enabling that flag on every computer I'd like to deploy from.
instead, I went with the most barebones approach possible, [[https://www.haskellforall.com/2023/01/announcing-nixos-rebuild-new-deployment.html][nixos-rebuild]], by using the following command:
#+begin_src sh
nixos-rebuild switch --fast --use-remote-sudo \
  --flake .#<remote> \
  --build-host <remote-host-url> \
  --target-host <remote-host-url>
#+end_src
this works because =--fast= avoids rebuilding =nixos-rebuild=, and passing =--build-host= forces it to build directly on the pi, avoiding the cross compilation issue. I still intend to use a proper build tool (most inclined to using [[https://github.com/serokell/deploy-rs][deploy-rs]]) but that is for the future.
*** self hosting
after setting up a way to connect to the pi from the public network, I could finally get some self hosting started.
initially, all I did was a simple setup where I added my blog's repository as a flake input, and served the result of calling =hugo build= on it through nginx. it looked something like the following:
#+begin_src nix
let
  blog = pkgs.stdenv.mkDerivation {
    name = "hugo-blog";
    src = inputs.blog;
    buildInputs = [ pkgs.hugo ];
    buildPhase = ''
      mkdir $out
      hugo --destination $out
    '';
  };
in {
  # ....
  networking.firewall.allowedTCPPorts = [ 80 443 ];
  services.nginx = {
    enable = true;
    virtualHosts."santi.net.br" = {
      addSSL = true;
      enableACME = true;
      root = blog;
    };
  };
  security.acme = {
    acceptTerms = true;
    certs."santi.net.br".email = "[email protected]";
  };
}
#+end_src
it sure worked fine for the first couple of weeks, and it auto-generated ssl certificates for me, which is convenient, but it had a glaring flaw: in order to change something, I'd need to push a new commit to the blog repo, =nix flake update blog=, and then =nixos-rebuild switch= (remotely) on the pi, every single time. the whole process was unnecessarily complicated, so I set out to build a simpler one.
I vaguely knew that git repos have a notion of hooks that run before or after commands and actions you take, but I had never implemented or tinkered with them. still, it occurred to me that if I could set up a bare git "upstream" on my pi, with a hook that runs after every push it receives, I could run =hugo build= on the source files and generate a new blog in a known path, which I could then hardwire =nginx= to constantly watch. this way, it would be very much like the old setup I had with github pages, except local and not depending on microsoft's ai products.
funnily enough, mere minutes after searching for this idea on the internet, I found a [[https://andreas.rammhold.de/posts/git-receive-blog-hook-deployment/][blog post]] by Andreas that did exactly that. while searching, I also figured it would be pretty cool to have a [[https://git.zx2c4.com/cgit/][cgit instance]] exposed, tracking my changes in this "git repos" directory, so that I could really stop relying on github while keeping the code fully open source.
the main idea is to declaratively [of course] set up a git repository pre-baked with a =post-receive= hook file that calls =hugo build= with the directory we'd like =nginx= to watch. Andreas' post shows exactly how to idempotently create the git repo (a no-op on subsequent runs) using a systemd one shot service, and the only problem remaining is, as always, managing the permissions around these directories:
1. my user, =leonardo=, has its own files and is what I use to run =nixos-rebuild= from.
2. the =git= user will own the permissions to the git repositories directory.
3. the =cgit= user will be responsible for running the cgit server.
4. the =nginx= user is responsible for running the nginx instance and responding to requests.
thus, I devised the following structure:
- =/server/blog= is where the hugo-generated files are going to be. the =nginx= user must be able to read it, and =git= must be able to write to it.
- =/server/git-repos= is where the git repositories will be. the =cgit= user must be able to read all of its directories, and the =git= user must be able to read and write to it.
it seems to suffice to set =git= as the owner of both of these directories, and to give all users permission to read and execute files. to implement this, I used =systemd.tmpfiles.rules=. I know, there's =tmp= in the name, but rest assured: you can use them to create permanent files with the correct permissions, as long as you don't give them an age parameter:
#+begin_src nix
users.users.git = {
  description = "git user";
  isNormalUser = true;
  home = git-repo-path;
};
systemd.tmpfiles.rules = [
  "d ${blog-public-path} 0755 git users -"
  "d ${git-repo-path} 0755 git users -"
];
#+end_src
after figuring this stuff out, the rest is pretty much textbook nixos. we set up cgit with ~scanPath = git-repo-path~, and an about-filter that uses =pandoc= to correctly render the repositories' org README pages:
#+begin_src nix
services.cgit.santi = let
  org2html = pkgs.writeShellScript "org2html" ''
    ${pkgs.pandoc}/bin/pandoc \
      --from org \
      --to html5 \
      --sandbox=true \
      --html-q-tags \
      --ascii \
      --standalone \
      --wrap=auto \
      --embed-resources \
      -M document-css=false
  '';
in {
  enable = true;
  scanPath = git-repo-path;
  nginx.virtualHost = "git.santi.net.br";
  settings = {
    readme = ":README.org";
    root-title = "index";
    root-desc = "public repositories for santi.net.br";
    about-filter = toString org2html;
    source-filter = "${pkgs.cgit}/lib/cgit/filters/syntax-highlighting.py";
    enable-git-config = true;
    enable-html-cache = false;
    enable-blame = true;
    enable-log-linecount = true;
    enable-index-links = true;
    enable-index-owner = false;
    enable-commit-graph = true;
    remove-suffix = true;
  };
};
#+end_src
while the following snippet sets up a systemd one shot service to initialize the path to the blog's public files (run as the =git= user):
#+begin_src nix
systemd.services."blog-prepare-git-repo" = {
  wantedBy = [ "multi-user.target" ];
  path = [
    pkgs.git
  ];
  script = ''
    set -ex
    cd ${git-repo-path}
    chmod +rX ${blog-public-path}
    test -e blog || git init --bare blog
    ln -nsf ${post-receive} blog/hooks/post-receive
  '';
  serviceConfig = {
    Type = "oneshot";
    User = "git";
  };
};
#+end_src
where the =post-receive= hook is very similar to the one Andreas used in his post:
#+begin_src nix
post-receive = pkgs.writeShellScript "post-receive" ''
  export PATH=${env}/bin
  set -ex
  GIT_DIR=$(${pkgs.git}/bin/git rev-parse --git-dir 2>/dev/null)
  if [ -z "$GIT_DIR" ]; then
    echo >&2 "fatal: post-receive: GIT_DIR not set"
    exit 1
  fi
  TMPDIR=$(mktemp -d)
  function cleanup() {
    rm -rf "$TMPDIR"
  }
  trap cleanup EXIT
  ${pkgs.git}/bin/git clone "$GIT_DIR" "$TMPDIR"
  unset GIT_DIR
  cd "$TMPDIR"
  ${pkgs.hugo}/bin/hugo --destination ${blog-public-path}
'';
#+end_src
after running it the first time, I went ahead and statefully copied the git repo from github to the pi, in order to not lose the history, but other than that it should be fine.
*** next steps
sadly, I haven't had the time to actually set up email hosting. currently, I read my email through [[https://djcbsoftware.nl/code/mu/mu4e.html][mu4e]], using mu as a local maildir indexer and searcher. what I'd need is to host a server to receive and send email. receiving doesn't seem too difficult, as it's just a normal listener, but sending apparently is a huge problem, as there seem to be a lot of measures that need to be taken for your email to actually be delivered and not flagged as spam.
besides having to set up reverse DNS lookups, you also need to mess with SPF, DMARC and DKIM, which are scary looking acronyms for boring email authentication stuff. moreover, your ip might be blacklisted, or have low reputation (what does that even mean?), and to top it off it seems like my router's port 25 is blocked forever, so I'd most likely also need to configure cloudflare tunnels for that. I'm currently avoiding all of it, but I intend to look into it in the near future.
I've been meaning to experiment with [[https://gitlab.com/simple-nixos-mailserver/nixos-mailserver][nixos simple mailserver]]'s setup for a while now, but it is an "all in one" solution, and I think it might be trying to do much more than what I'm currently trying to achieve. if anyone has tinkered with it, I'd love to know more about it.
** where flakes fall off: an eval cache tale :nix:flakes:
:PROPERTIES:
:EXPORT_FILE_NAME: where_flakes_fall_off
:END:
#+BEGIN_COMMENT
useful slack threads:
- ruff.sh is slow: https://mixrank.slack.com/archives/C07JBGABHKL/p1733410175630389
- performance regression on nix build ./src/mixrank/rust#rust https://mixrank.slack.com/archives/C060JM3MP7Z/p1732321672241059
- nix evaluation caching rundown: https://mixrank.slack.com/archives/C07JBGABHKL/p1733410175630389
- nix forall implementation thread: https://mixrank.slack.com/archives/C07JBGABHKL/p1734114894668799
#+END_COMMENT
nix flakes are great. they standardize structured attributes like ~packages~, ~devShells~ and ~nixosConfigurations~, make it fairly easy to integrate other flakes as inputs, and provide a much better interface to the "new style" nix commands than the old ones did. they also provide a way to automatically update hashes for external inputs, superseding the ~nix-channel~ approach of software distribution, which was [[https://samuel.dionne-riel.com/blog/2024/05/07/its-not-flakes-vs-channels.html][very bad]] and unintuitive to beginners.
to start the story: there's one killer feature over normal non-flake nix commands that's really valuable, that of [[https://www.tweag.io/blog/2020-06-25-eval-cache/][evaluation caching]]. the premise is pretty simple: some nix expressions are fairly complicated and take a long time to evaluate, so if we're sure none of the inputs for the expression have changed, we can avoid re-evaluating the whole expression altogether by substituting in the final result (usually a derivation path) from a cache.
there are two main ingredients in the flakes soup that enable this feature:
1. that they prohibit "impure" nix: all nix expressions that fetch some value from the outside (eg. ~fetchurl~, ~fetchgit~) must do so by providing a hash of some sort (commonly referred to as "locking"). they also prohibit expressions that read directly from the environment at runtime (eg. ~nixPath~, ~getEnv~), as those may cause unreproducibility with no code changes.
2. that they have predefined [[https://nixos.wiki/wiki/Flakes][types and schemas]] for every possible flake attribute path, such that you can be sure which commands can and cannot be cached based on which paths they access.
the theory states that if none of the inputs for a given pure expression have changed, then the output must be byte-for-byte equal every time[fn:reproducibility]. the way flakes implement this is with a ~.sqlite~ file that links all the inputs' content hashes, mashed together with the specific path used to access the flake, pointing directly to the ~outPath~ of the derivation, tucked in with some metadata. the mental model is that when you invoke ~nix build ./path#attr~ (or another cacheable command), nix will calculate the input content hash, try to fetch it from the sqlite file, and if it matches - great, you just skipped multiple seconds of your startup time.
as it so happens, I did have in hand a derivation that I wanted to evaluate very rapidly, that took about 5 seconds to evaluate, and that I thought flakes could very well be a solution for. after rewriting the expression and benchmarking, the time to build it went from 5s to 300ms. that's a dandy improvement! case closed. nothing more to complain about. right? right? well... ergh.
*** the pitfalls
in order to understand the main pain points, let's examine from the ground up what ~nix build~ evaluation caching looks like, through the following directory structure:
#+begin_src sh
$ ~ :: mkdir my-repo
$ ~ :: cd my-repo
$ ~/my-repo :: git init
[...]
$ ~/my-repo :: mkdir a/b/c -p
$ ~/my-repo :: cd a/b/c && nix flake init
wrote: "/home/leonardo/my-repo/a/b/c/flake.nix"
$ ~/my-repo/a/b/c ::
#+end_src
the default flake created by ~nix flake init~ contains the ~nixpkgs.hello~ derivation, so it's perfect for our tests:
#+begin_src sh
$ ~/my-repo :: nix build ./a/b/c#default
error: path '/nix/store/0ccnxa25whszw7mgbgyzdm4nqc0zwnm8-source/a/b/c/flake.nix' does not exist
#+end_src
huh, file doesn't exist? what's that path? it turns out that when no fetcher is specified, ~./a/b/c#default~ will be interpreted as ~git+file:./a/b/c#default~, ie. it's assumed to be a git repository. as such, nix will first do a pseudo ~git clone --depth 1~ into the nix store, with the contents taken from the *working tree*, and then evaluate from there - but the generated path (~/nix/store/0ccnxa25whszw7mgbgyzdm4nqc0zwnm8-source~) doesn't contain our ~flake.nix~ file, as it hasn't been added to the git tree yet.
one may also notice an additional "quirk": the copied directory is not =~/my-repo/a/b/c= but =~/my-repo=. as it turns out, my previous statement was misleading, as ~./a/b/c#default~ is actually interpreted as ~git+file:.?dir=a/b/c#default~. this means that the whole =~/my-repo= git directory is copied to the store, not only the subdirectory which contains the flake[fn:1]. running ~git add a/b/c/flake.nix~ before calling ~nix build~ should do the trick.
#+begin_src sh
$ ~/my-repo :: time nix build ./a/b/c#default
warning: Git tree '/home/leonardo/my-repo' is dirty
warning: creating lock file '/home/leonardo/my-repo/a/b/c/flake.lock':
• Added input 'nixpkgs':
'github:nixos/nixpkgs/b024ced1aac25639f8ca8fdfc2f8c4fbd66c48ef?narHash=sha256-fusHbZCyv126cyArUwwKrLdCkgVAIaa/fQJYFlCEqiU%3D' (2025-04-17)
warning: Git tree '/home/leonardo/my-repo' is dirty
real 0m0,881s
user 0m0,460s
sys 0m0,277s
#+end_src
now, let's get this eval cache party started!!!
#+begin_src sh
$ ~/my-repo :: time nix build ./a/b/c
warning: Git tree '/home/leonardo/my-repo' is dirty
real 0m0,889s
user 0m0,461s
sys 0m0,284s
$ ~/my-repo :: time nix build ./a/b/c
warning: Git tree '/home/leonardo/my-repo' is dirty
real 0m0,896s
user 0m0,457s
sys 0m0,284s
$ ~/my-repo :: time nix build ./a/b/c
warning: Git tree '/home/leonardo/my-repo' is dirty
real 0m0,901s
user 0m0,469s
sys 0m0,288s
#+end_src
huh, 900ms is way too high for a cache hit, and it does not seem to go lower than that. this is because ~nix build~ will always copy the directory to the store when the git tree is dirty, which forces it to re-evaluate every time. I cannot find any explanation nor discussion about this topic, but one plausible explanation would be that nix uses the git HEAD commit itself to check whether the tree has already been copied to the store, and if the tree is dirty then there's no commit to look at. this should also be why it complains so much about ~Git tree '/home/leonardo/my-repo' is dirty~. invoking ~nix build~ with ~-vvv~ gives us a better view:
#+begin_src sh
$ ~/my-repo :: time nix build ./a/b/c#default -vvv
evaluating file '<nix/derivation-internal.nix>'
evaluating derivation 'git+file:///home/leonardo/my-repo?dir=a/b/c#default'...
warning: Git tree '/home/leonardo/my-repo' is dirty
source path '/home/leonardo/my-repo/home/leonardo/my-repo/' is uncacheable
copying '/home/leonardo/my-repo/home/leonardo/my-repo/' to the store...
performing daemon worker op: 7
acquiring write lock on '/nix/var/nix/temproots/9659'
performing daemon worker op: 26
got tree '/nix/store/0r5c4j5yww8mljggyy3f97glppgnc2pk-source' from 'git+file:///home/leonardo/my-repo?dir=a/b/c'
evaluating file '/nix/store/0r5c4j5yww8mljggyy3f97glppgnc2pk-source/a/b/c/flake.nix'
(...)
#+end_src
not only do we see the whole copy going on behind our backs, together with all the file evaluations, we can also see the following worrying line: ~source path '/home/leonardo/my-repo/home/leonardo/my-repo/' is uncacheable~ (I'll assume the duplicated path is just an issue with the debug print). yep, *flakes can only cache evaluations on clean git trees*.
#+begin_src sh
$ ~/my-repo :: git add .
$ ~/my-repo :: git commit -m 'initial commit'
[master (root-commit) 0a8327d] initial commit
3 files changed, 43 insertions(+)
create mode 100644 a/b/c/flake.lock
create mode 100644 a/b/c/flake.nix
create mode 120000 result
$ ~/my-repo :: time nix build ./a/b/c
real 0m0,948s
user 0m0,527s
sys 0m0,269s
$ ~/my-repo :: time nix build ./a/b/c
real 0m0,103s
user 0m0,051s
sys 0m0,026s
#+end_src
finally, we see the cache hit. if we take a look at what's going on - this time with ~-vvvv~ - we see:
#+begin_src sh :hl_lines 6
$ ~/my-repo :: time nix build ./a/b/c -vvvv
evaluating file '<nix/derivation-internal.nix>'
evaluating derivation 'git+file:///home/leonardo/my-repo?dir=a/b/c#packages.x86_64-linux.default'...
using cache entry 'gitRevCount:{"rev":"0a8327deeee66e66f6385f7d134195620649f648"}' -> '{"revCount":1}'
using cache entry 'gitLastModified:{"rev":"0a8327deeee66e66f6385f7d134195620649f648"}' -> '{"lastModified":1745187032}'
using cache entry 'fetchToStore:{"fingerprint":"0a8327deeee66e66f6385f7d134195620649f648","method":"nar","name":"source","path":"/","store":"/nix/store"}' -> '{"storePath":"44yqsx1xq89f6ap49h7a7jp483229y3l-source"}'
performing daemon worker op: 11
acquiring write lock on '/nix/var/nix/temproots/11438'
performing daemon worker op: 1
using cache entry 'fetchToStore:{"fingerprint":"0a8327deeee66e66f6385f7d134195620649f648","method":"nar","name":"source","path":"/","store":"/nix/store"}' -> 'null', '/nix/store/44yqsx1xq89f6ap49h7a7jp483229y3l-source'
store path cache hit for '/home/leonardo/my-repo/home/leonardo/my-repo/'
performing daemon worker op: 26
got tree '/nix/store/44yqsx1xq89f6ap49h7a7jp483229y3l-source' from 'git+file:///home/leonardo/my-repo?dir=a/b/c&ref=refs/heads/master&rev=0a8327deeee66e66f6385f7d134195620649f648'
#+end_src
indeed, my theory seems to be correct: the ~fetchToStore~ cache hit seems to be using a ~fingerprint: 0a8327deeee66e66f6385f7d134195620649f648~ that directly references the current commit:
#+begin_src sh
$ ~/my-repo :: git log --pretty=oneline
0a8327deeee66e66f6385f7d134195620649f648 (HEAD -> master) initial commit
#+end_src
moreover, this state is extremely fragile. if we add a new file to the directory, even one that is never read by the nix expression, it will invalidate the cache, because it generates a different store path, and the git tree isn't clean anymore.
#+begin_src sh
$ ~/my-repo :: touch newfile
$ ~/my-repo :: git add newfile
$ ~/my-repo :: time nix build ./a/b/c
warning: Git tree '/home/leonardo/my-repo' is dirty
real 0m0,885s
user 0m0,465s
sys 0m0,256s
#+end_src
in my own example, the derivation I was examining initially lives in a subdirectory of a multi-gigabyte monolithic git repository, so one can only imagine how fast copying it to the store on almost every invocation really is. benchmarking it, it seems that a whole second is spent just copying to the store, while actually evaluating the expression takes around 3.5s.
but this is very disappointing. most of the time when working on a repository, you'll have uncommitted source code, so you'll most likely never follow the fastest cache codepath. hell, you're most likely rarely hitting the cache at all, as it requires you to have no changes to the whole repository since the last time you ran it.
thus, we can summarize the two main issues with flakes evaluation cache as:
1. *lack of granularity of inputs*, where some of the inputs tracked for cache invalidation are irrelevant to the nix expression itself.
2. *copying the repository to the store*, as the local repository must be copied to the store before evaluation even starts, adding a lot of time when the repository is sizable enough.
*** alternatives
I'm not the first person to have suffered from these issues. the [[https://github.com/NixOS/nix/pull/6530][lazy trees pull request]] has been ongoing since 2022, trying to avoid unnecessary copies to the store when the local repository store path isn't read during evaluation. I don't know the current state of its implementation, but the changes are so big that it's been spliced into multiple smaller PRs, so I do not know when this issue is getting fixed. it also does not try to tackle the lack of granularity of cache invalidation, so I had to look for other solutions.
the [[https://devenv.sh/][devenv.sh]] team had noticed that the flake eval cache was unreliable and [[https://devenv.sh/blog/2024/10/03/devenv-13-instant-developer-environments-with-nix-caching/][hand rolled their own]]. the solution to the copies is to simply not do them, by using ~nix-build~ instead of ~nix build~; but, in order to get better granularity, you need to track which files are read during evaluation, and implement and invalidate the cache yourself.
they do it using the same approach as the [[https://github.com/nix-community/lorri][lorri tool]]: leveraging nix's internal logs. invoking ~nix-build~ with ~-vv --log-format internal-json~ provides a very barebones look into exactly which files and directories are read during evaluation, and those reads form the basis of a much nicer and more granular cache.
and while that implementation does work decently, it is very cumbersome to maintain, as any change to the internal log format means that the implementation breaks - one can only imagine the stability of /that/. another issue with this approach is that it cannot track reads done outside of the main nix runtime; in particular, it cannot track files read by [[https://jade.fyi/blog/nix-evaluation-blocking/][IFD]]s, which is a downside given that, unfortunately, my original expression did contain an IFD.
the drive for better integration with the nix evaluation API was such that they decided to switch to [[https://devenv.sh/blog/2024/10/22/devenv-is-switching-its-nix-implementation-to-tvix/][tvix]], where the runtime is designed with observability in mind and API access from rust is much smoother[fn:2]. to the best of my knowledge, tvix is not yet stable enough to be used even in development, as there are some missing implementation details, so I cannot ascertain how far into the switch they are.
there's one additional quality of the tvix evaluator that I think is worth mentioning. given that it is implemented as a bytecode VM, a smart enough optimizing compiler should be able to inline most function calls in an expression, given they all have concrete values in hand - the overall expression should just evaluate to a derivation, after all. it seems that the devenv.sh team is heading in this direction, trying to leverage some mixed file-granular bytecode caching to avoid invalidating the whole cache when single files change. I reckon this could very well be the best approach for a change-resistant, very fast nix evaluator.
*** nix-forall's eval cache
in the meantime, for the last couple of months I've been developing a rust wrapper around the recently stabilized nix C API, called [[https://github.com/o-santi/nix-forall][nix-forall]], and decided to give my own twist to solving this problem. instead of relying on the internal logs - which aren't even available from the nix C API (/yet/?) - I decided to try the linux ~ptrace~ API! what could go wrong tracing *all* system calls, looking for the ones that open file descriptors?
I chose ~ptrace~ over ~fanotify~ and ~inotify~ because I did not want to require a single root repository to watch, as I wanted to support non-flake alternatives too, and there isn't a good way to guess what the root repository might be without a ~flake.nix~ - other than ~/~, which the user might not have privileges to read.
using ~ptrace~ meant I'd need to track all system calls, and filter for those that open file descriptors as well as those that read from them, which wasn't easy. ~ptrace~ also meant that I'd need to ~fork~ the original process, as I wasn't able to trace a process's thread from the parent itself[fn:3]. in order to reduce the number of system calls to track, one can use [[https://man7.org/linux/man-pages/man2/seccomp.2.html][libseccomp]] to raise ~PTRACE_EVENT_SECCOMP~ stops only at specific syscalls; with pure ~ptrace~, one must stop at every single syscall and ignore the unrelated ones, paying a big price in context switches between the kernel, the tracee and the tracer every time[fn:seccomp].
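to give a flavor of the tracer loop, here's a heavily simplified sketch of the pure-~ptrace~ approach (no seccomp filtering, single process, assuming x86_64 linux and the rust ~nix~ crate with its ~ptrace~ and ~process~ features - the unix syscall wrapper, unrelated to the package manager):
#+begin_src rust
use nix::sys::ptrace;
use nix::sys::wait::{waitpid, WaitStatus};
use nix::unistd::{execvp, fork, ForkResult};
use std::ffi::CString;

fn main() {
    match unsafe { fork() }.expect("fork failed") {
        ForkResult::Child => {
            // ask to be traced, then become the evaluator process.
            ptrace::traceme().unwrap();
            let cmd = CString::new("nix-build").unwrap();
            execvp(&cmd, &[cmd.clone()]).unwrap();
        }
        ForkResult::Parent { child } => {
            // first stop: the SIGTRAP raised at exec.
            waitpid(child, None).unwrap();
            loop {
                // run until the next syscall boundary. note: enter and exit
                // stops alternate, and *you* must track which one this is.
                ptrace::syscall(child, None).unwrap();
                match waitpid(child, None).unwrap() {
                    WaitStatus::Exited(_, code) => {
                        println!("tracee exited with {code}");
                        break;
                    }
                    _ => {
                        let regs = ptrace::getregs(child).unwrap();
                        // on x86_64, orig_rax holds the syscall number;
                        // 257 is openat. a real tracer would read the path
                        // argument (in rsi) out of the tracee's memory.
                        if regs.orig_rax == 257 {
                            println!("openat(path @ {:#x})", regs.rsi);
                        }
                    }
                }
            }
        }
    }
}
#+end_src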
in the end, my implementation ended up with a hefty 10% performance tax on the slow codepath (though my benchmarks are not of great quality at the moment), as it pays both the ~fork()~ and the syscall monitoring costs, but it was able to severely limit the number of files that need to be tracked, while also including the files read during IFD.
however, just as the slow code path got slower, the fast path got even faster: the build, written with almost no optimizations, takes 5ms instead of the old 300ms - a whopping 60x improvement - and it *persists through writes to unrelated files*, something that flakes never even dreamed of doing. this is to be expected, as there's basically no heavy machinery needed for a cache hit: all it does is hash the source file and query for any cached results for the specific attribute path. if there are any, it asserts that none of the relevant files' hashes have changed, and if so, it returns the cached output directly.
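the shape of that fast path is roughly the following (a minimal rust sketch of the idea, not nix-forall's actual code; assumes the ~sha2~ crate):
#+begin_src rust
use sha2::{Digest, Sha256};
use std::{collections::HashMap, fs, path::{Path, PathBuf}};

// one hypothetical cache entry per (source, attribute path): the files
// the last evaluation actually read, with their hashes, plus the result.
struct CacheEntry {
    read_files: HashMap<PathBuf, [u8; 32]>,
    out_path: String,
}

fn hash_file(path: &Path) -> std::io::Result<[u8; 32]> {
    Ok(Sha256::digest(fs::read(path)?).into())
}

// a hit only re-hashes the files that mattered last time: no store
// copies, no whole-repo cleanliness requirement.
fn lookup(entry: &CacheEntry) -> Option<&str> {
    for (path, old_hash) in &entry.read_files {
        match hash_file(path) {
            Ok(h) if h == *old_hash => continue,
            // changed or unreadable: invalidate, fall back to evaluation.
            _ => return None,
        }
    }
    Some(&entry.out_path)
}
#+end_src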
this is by no means a definitive solution. the current implementation is very barebones, and does not even support flakes. the C API itself is fairly unstable and not feature complete - it [[https://github.com/NixOS/nix/pull/12877][just recently got support for handling and fetching flake inputs]] - so in the meantime I tried to design my evaluation cache to handle flakes as well as non-flakes. the best I came up with was to take the accessor path (eg. ~packages.x86_64-linux.default~) into consideration when inserting things into the cache, much like flakes do, and to use a modified version of [[https://github.com/edolstra/flake-compat][eelco's flake-compat]] to access the flake through a non-flake interface, as the original package I wanted to interact with was offered through a flake interface, after all.
it also contains implementation bugs in the main loop when multithreading is involved (solely due to my poor skills with it), as it isn't exactly trivial to track the state of multiple processes at once when the ~ptrace~ API has a bunch of implicitly tracked state. for example, one can issue ~PTRACE_SYSCALL~ requests to stop the tracee at the next syscall, but whether a given stop refers to the pre-execution or post-execution state of said syscall is tracked entirely implicitly, and you're expected to keep track of it yourself. these kinds of issues are bound to be ironed out as time passes and I understand what I'm doing wrong, but it isn't exactly easy to find much information about what handling this properly looks like, so usually I can only refer to debugger source code implementations.
still, I *do not* think my approach is going to be the future of evaluation caching, due to its sheer complexity and difficulty to maintain. I reckon an approach based on internal compiler observability is going to win long term, and it doesn't necessarily need to be through tvix. the nix team has mentioned[fn:5] there's some interest in integrating an OpenTelemetry approach to nix's internal logs, improving observability and allowing it to be operated more cleanly from the outside.
either way, I would really like to see some attention given to these pain points of using flakes, as I believe a lot of people will end up with the same problems as mine.
[fn:reproducibility] this isn't exactly true. there are ways to create non-reproducible builds even if the source inputs haven't changed, but the general consensus in the community seems to be that you need to go out of your way to cause unreproducibility in your derivation. [[https://luj.fr/blog/is-nixos-truly-reproducible.html][this great blog post]] by Julien Malka reports his research to measure how well nixpkgs fares against reproducible builds, citing that the main ways that differences occur is when some kind of environment data is embedded into the build script itself.
[fn:1] the best source I could find for the reasoning behind this decision is [[https://hackmd.io/@nix-ux/Hkvf16Xw5?print-pdf#/][this old nix UX team pdf]] questioning the decision, saying that it is lazily inherited from the ~fetchGit~ implementation. the only argument for this behavior seems to be that it implements a way to ignore files in ~.gitignore~, but even so it seems like a hacky implementation at best.
[fn:2] I'm selling tvix a bit short here, their implementation is awesome. they offer a content addressed store that [[https://blog.replit.com/tvix-store][reduces storage space by as much as 80%]], a byte code optimizing compiler and VM runtime, instead of the slow AST fold nix does, and most incredibly, an asynchronous evaluator that treats IFDs as first class objects that can be executed in parallel with nix code.
[fn:3] I'm saying this based on this [[https://yarchive.net/comp/linux/ptrace_self_attach.html][2006 email thread]] from Linus saying that they added checks to avoid ~ptrace~'ing when both the parent and the child are in the same thread group, which means that I cannot simply spawn a thread to dispatch the job. I cannot find any more information about this, but trying it locally indeed failed.
[fn:seccomp] [[https://www.alfonsobeato.net/c/filter-and-modify-system-calls-with-seccomp-and-ptrace/][this amazing blog post]] by Alfonso Beato gives a very in depth example on how to use the ~ptrace~ + ~seccomp~ power duo.
[fn:5] this has been mentioned as [[https://discourse.nixos.org/t/2024-10-16-nix-team-meeting-minutes-187/54835][a proceeding of one of the nix team meetings]], referencing this [[https://github.com/NixOS/nix/pull/7247][2022 issue that hasn't seen much activity]].
* COMMENT Local Variables :ARCHIVE:
# Local Variables:
# eval: (org-hugo-auto-export-mode)
# End:
|