this post was submitted on 23 Nov 2023


Being a FOSS enthusiast, I can configure most of my software in way too many ways. However, I noticed that this is not true for most compilers, which got me thinking: why isn't that the case? In gcc (or your favorite compiler tool) I have a shitload of options about what counts as an error or a warning, how the code should be compiled, and tons of other things. But none about how the code should be interpreted or what the code should look like.

Why can't I simply add a module to a build process to make it [object-oriented | use indentation instead of brackets | automatically allocate memory | automatically infer types | auto forward-declare | some other thing that differentiates one language from another]* ? It's so weird that I have a PDF reader that has an option to set the window icon, a mail client that lets me specify a regex to search for a mentioned but forgotten attachment, and a game that lets me set my texture picmip resolution, but the tool (gcc) that builds these things doesn't even have a config file built in. We have build tools around it to supply arguments.

This could look like the following (oversimplified):

  1. preprocess
  2. compile
  3. assemble
  4. link

v

  1. add brackets from indentation
  2. preprocess
  3. check that all object-oriented constraints are satisfied
  4. do something else
  5. compile
  6. assemble
  7. run the assembly through, for example, an AI for antivirus scanning
  8. link
  9. run test
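A minimal sketch of what such a configurable front half could look like, assuming every stage is just a function from source text to source text. The stage names (strip_line_comments, braces_from_indentation) and the run_pipeline helper are made up for illustration and are not part of any existing build tool.

```python
from typing import Callable, List

# Hypothetical convention: a stage takes source text and returns source text.
Stage = Callable[[str], str]


def strip_line_comments(source: str) -> str:
    """Toy preprocessing stage: drop // comments (naive, ignores string literals)."""
    return "\n".join(line.split("//", 1)[0].rstrip() for line in source.splitlines())


def braces_from_indentation(source: str, indent: int = 4) -> str:
    """Toy stage: lines ending in ':' open a block, dedenting closes it with '}'."""
    out: List[str] = []
    depth = 0
    for line in source.splitlines():
        if not line.strip():
            continue
        level = (len(line) - len(line.lstrip(" "))) // indent
        # Close every block that is deeper than the current line.
        while depth > level:
            depth -= 1
            out.append(" " * (depth * indent) + "}")
        text = line.strip()
        if text.endswith(":"):
            out.append(" " * (level * indent) + text[:-1].rstrip() + " {")
            depth = level + 1
        else:
            out.append(" " * (level * indent) + text)
    # Close whatever is still open at end of file.
    while depth > 0:
        depth -= 1
        out.append(" " * (depth * indent) + "}")
    return "\n".join(out)


def run_pipeline(source: str, stages: List[Stage]) -> str:
    """Run the configured stages in order; this list would be the per-project config."""
    for stage in stages:
        source = stage(source)
    return source


if __name__ == "__main__":
    src = "int main(): // entry point\n    return 0;\n"
    print(run_pipeline(src, [strip_line_comments, braces_from_indentation]))
```

Note that the order matters here: if the comment stripper ran second, the brace stage would not see the trailing ':' on the first line, which is the kind of stability problem you get once a stage can no longer assume it runs first.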

There could also be a fork in this process: sending for example the source code both to a compiler and an interpreter to detect edge case behavior while compiling. Or compile with both automatic typing and your defined typing so that when rounding errors are big you can instantly compare with a dynamically typed version of your program. Or the other way around, maybe you want different parts of your code to be handled with different preprocessors.
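As a loose analogy for the rounding-error fork, here is a sketch that runs the same computation down two paths, once with floats and once with exact fractions, and reports how far they drift apart. It happens at run time rather than compile time, and the fork_and_compare name and the tolerance parameter are assumptions for illustration only.

```python
from fractions import Fraction
from typing import Dict, Sequence


def fork_and_compare(inputs: Sequence[str], tolerance: float) -> Dict[str, object]:
    """Sum the same decimal inputs as floats and as exact fractions, then compare."""
    as_float = sum(float(s) for s in inputs)
    as_exact = sum(Fraction(s) for s in inputs)
    # Fraction(float) converts the float's exact binary value, so the drift is exact too.
    drift = abs(Fraction(as_float) - as_exact)
    return {
        "float": as_float,
        "exact": as_exact,
        "drift": float(drift),
        "ok": drift <= Fraction(tolerance),
    }


if __name__ == "__main__":
    report = fork_and_compare(["0.1"] * 10, tolerance=1e-9)
    print(report)
```

A real version of the idea would fork inside the build and diff the behavior of the two resulting programs, but the comparison step would look roughly like this.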

The build process should be configured per project for things about the input like syntax and per computer for things about the output like optimizations.

There are of course some drawbacks. One is a trust issue, where someone pulls in an obscure module to build malicious releases. It is probably also harder to maintain stability when you have to keep in mind that your preprocessor isn't the first to be run. And your compiling process can take a lot longer if you have to go through multiple pre, post or even compilation phases.

If you know such a build tool, or c (: haha :) some obvious reasons that this should not exist, please let me know. Thank you for reading this lengthy post.

Thanks for the comments, based on them I think I can better explain what I want. I would like a language that has a minimal specification, so that its preprocessor, compiler, assembler and linker are a collection of plugins rather than one chunky program.

So the compiler reads, for example, the line int main(int argc, char **argv), and then all main body plugins get an event_newline. The function plugin reads this and creates a new object that contains the function main. It then emits an event_functionBody that is caught by other plugin(s), which read the contents of main and return what it has to do.
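Roughly, that event flow could be sketched like this. Only the event names event_newline and event_functionBody come from the description above; the Bus class, the two plugins and the signature regex are hypothetical, and the regex is deliberately naive.

```python
import re
from collections import defaultdict

# Naive pattern for "<type> <name>(<params>)", optionally followed by "{".
SIGNATURE = re.compile(r"^\s*(\w+)\s+(\w+)\s*\(([^)]*)\)\s*\{?\s*$")


class Bus:
    """Hypothetical core: it only dispatches events, plugins do the actual work."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.functions = {}   # function name -> record built by plugins
        self.current = None   # name of the function whose body is being read

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def emit(self, event, payload):
        for handler in self.handlers[event]:
            handler(self, payload)


def function_plugin(bus, line):
    """Reacts to event_newline: registers functions and forwards their body lines."""
    match = SIGNATURE.match(line)
    stripped = line.strip()
    if match:
        returns, name, params = match.groups()
        bus.functions[name] = {
            "returns": returns,
            "params": [p.strip() for p in params.split(",") if p.strip()],
            "body": [],
        }
        bus.current = name
    elif stripped == "}":
        bus.current = None
    elif bus.current and stripped not in ("", "{"):
        bus.emit("event_functionBody", (bus.current, stripped))


def body_plugin(bus, payload):
    """Reacts to event_functionBody: here it just records the statement."""
    name, statement = payload
    bus.functions[name]["body"].append(statement)


def compile_source(source):
    bus = Bus()
    bus.on("event_newline", function_plugin)
    bus.on("event_functionBody", body_plugin)
    for line in source.splitlines():
        bus.emit("event_newline", line)
    return bus


if __name__ == "__main__":
    bus = compile_source("int main(int argc, char **argv) {\n    return 0;\n}\n")
    print(bus.functions)
```

Swapping body_plugin for something that emits code instead of recording text would be the "return what it has to do" part.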

[–] spykyvenator@programming.dev 1 points 11 months ago* (last edited 11 months ago) (2 children)

Yes, not sure what you mean by this, but it's indeed what I'm getting at: our compilers aren't built in the Unix fashion enough for my liking. gcc handles preprocessing, compilation and linking, but I wouldn't know how to run a second preprocessor after the first one. I just did a quick search and apparently gcc -E stops after preprocessing, but it doesn't seem that intuitive to run gcc -E on all files into some temporary directory, run some other program on all the code there, and then compile and link. A pipeline would be nicer, and I also don't know any tools that can do additional preprocessing.
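For what it's worth, the temporary directory can be avoided by chaining the steps. Below is a sketch that builds the three commands and feeds each one's output into the next. As far as I know, -x cpp-output tells gcc that the input is already-preprocessed C and - makes it read from stdin, but check the gcc manual before relying on it. The build_commands and run_piped helpers and the filter command are made up for the example.

```python
import subprocess
from typing import List, Optional, Sequence


def build_commands(source: str, obj: str, extra_filter: Sequence[str]) -> List[List[str]]:
    """Preprocess with gcc -E, run an extra text filter, then compile from stdin."""
    return [
        ["gcc", "-E", source],                            # stop after preprocessing
        list(extra_filter),                               # hypothetical second preprocessor
        ["gcc", "-x", "cpp-output", "-c", "-", "-o", obj],  # compile preprocessed C from stdin
    ]


def run_piped(commands: Sequence[Sequence[str]]) -> Optional[bytes]:
    """Run the commands in order, passing each stdout to the next command's stdin.

    This buffers the whole output between steps instead of streaming it like a
    shell pipe would, which is fine for a sketch.
    """
    data = None
    for cmd in commands:
        result = subprocess.run(list(cmd), input=data, stdout=subprocess.PIPE, check=True)
        data = result.stdout
    return data


if __name__ == "__main__":
    # "sed s/FOO/BAR/" stands in for whatever the second preprocessor would be.
    for cmd in build_commands("main.c", "main.o", ["sed", "s/FOO/BAR/"]):
        print(" ".join(cmd))
```

Calling run_piped(build_commands(...)) would then do the actual build of one object file, assuming gcc and the filter are installed.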

[–] noli@programming.dev 5 points 11 months ago (2 children)

LLVM is designed in a very modular way and the LLVM IR allows you to specify e.g. if memory management should be manual/garbage collected.

You could make a frontend (design a language) for LLVM that exposes those options through some compiler directives.

In general I'd heavily recommend looking into LLVM's documentation.

[–] jeffhykin@lemm.ee 1 points 11 months ago (1 children)

Wow I knew some about LLVM IR but I had no idea it had high level options like garbage collection.

[–] noli@programming.dev 2 points 11 months ago

Oh yeah, it's actually pretty extensive and expressive. If you're interested in this sort of stuff it's worth checking out the IR language reference a bit. Apparently you can even specify the specific garbage collection strategy on a per-function basis if you want to. They do however specify the following: "Note that LLVM itself does not contain a garbage collector, this functionality is restricted to generating machine code which can interoperate with a collector provided externally" (source: https://llvm.org/docs/LangRef.html#garbage-collector-strategy-names )
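To make the per-function part concrete, here is a tiny sketch that prints what such IR looks like. The gc "..." clause on a define and the "shadow-stack" strategy name are from the language reference as I remember it, so double-check there. The define_function helper is just made up to emit the text.

```python
from typing import Optional


def define_function(name: str, gc_strategy: Optional[str] = None) -> str:
    """Emit a trivial LLVM IR function, optionally tagged with a GC strategy name."""
    gc_clause = f' gc "{gc_strategy}"' if gc_strategy else ""
    return f"define void @{name}(){gc_clause} {{\nentry:\n  ret void\n}}\n"


if __name__ == "__main__":
    # One function without GC support, one that opts in to a strategy.
    print(define_function("manual"))
    print(define_function("managed", gc_strategy="shadow-stack"))
```

The strategy only tells LLVM how to generate code that cooperates with a collector. The collector itself still has to come from your runtime, as the quote says.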

If you're interested in this stuff it's definitely fun to work through a part of that language reference document. It's pretty approachable. After going through the first few chapters I had some fun writing some IR manually for some toy programs.

[–] spykyvenator@programming.dev 1 points 11 months ago (1 children)

LLVM really looks like something that I need to look into

[–] jeffhykin@lemm.ee 2 points 11 months ago* (last edited 11 months ago)

LLVM is the engine everything compiles to. The problem is there's no car, it's just the engine lol.

And other than Rust (which uses LLVM) the existing cars are not very configurab--well I mean they're configurable but not at the extreme level of configuration you're talking about.

[–] nodoze313@lemmynsfw.com 1 points 11 months ago (1 children)

Does running lint prior not resolve the issue? Isn't this the entire goal of make, cmake, autotools, etc? Why do you need to run it after? So you can re-process the macros after they are in line? Should just validate the macros before running gcc.

[–] spykyvenator@programming.dev 1 points 11 months ago (1 children)

It is somewhat like running multiple linters and prettifiers, but these are hefty tools. The build tool should provide an interface that lets you attach different programs for every little step from code to machine language.

[–] monotremata@kbin.social 1 points 11 months ago* (last edited 11 months ago)

It really sounds like you're describing Make (or LLVM). Is there something you need it to do that those can't handle?