- WAY back in the annals of computer history:
  - the Dragon Book – the definitive book, but very much a classroom book (VERY dry reading)
  - SICP – looking at how to program (and thus why programming languages are built the way they are)
- much more recent:
  - Crafting Interpreters – from the author of Game Programming Patterns
Programming Languages
Hello!
This is the current Lemmy equivalent of https://www.reddit.com/r/ProgrammingLanguages/.
The content and rules are the same here as they are over there. Taken directly from the /r/ProgrammingLanguages overview:
This community is dedicated to the theory, design and implementation of programming languages.
Be nice to each other. Flame wars and rants are not welcomed. Please also put some effort into your post.
This isn't the right place to ask questions such as "What language should I use for X", "what language should I learn", and "what's your favorite language". Such questions should be posted in /c/learn_programming or /c/programming.
This is the right place for posts like the following:
- "Check out this new language I've been working on!"
- "Here's a blog post on how I implemented static type checking into this compiler"
- "I want to write a compiler, where do I start?"
- "How does the Java compiler work? How does it handle forward declarations/imports/targeting multiple platforms/?"
- "How should I test my compiler? How are other compilers and interpreters like gcc, Java, and python tested?"
- "What are the pros/cons of ?"
- "Compare and contrast vs. "
- "Confused about the semantics of this language"
- "Proceedings from PLDI / OOPSLA / ICFP / "
See /r/ProgrammingLanguages for specific examples
Related online communities
- ProgLangDesign.net
- /r/ProgrammingLanguages Discord
- Lambda the Ultimate
- Language Design Stack Exchange
As someone who's spent way too much time agonizing over picking the perfect parsing technique for my own language, I'm actually gonna go against the norm and recommend figuring out the parser later. Instead, you should start by building your language's ASTs directly in memory, and then from there either build a backend for converting ASTs to LLVM IR, or (what I'd actually do first) just start with an interpreter executing directly on the ASTs. This way you figure out the mechanics of your language, and then, once they're well established, you can worry about your syntax and how to parse it into ASTs. There's a lot you learn about your language by doing it this way that you don't necessarily think about if you just start from the parser. It also lets you see real progress/output sooner, which I think is key for staying motivated on these kinds of projects.
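To make that concrete, here's roughly what "start from the AST" can look like. This is a minimal Python sketch, and the node names and the toy let/add language are made up purely for illustration:

```python
from dataclasses import dataclass

# Hypothetical AST for a tiny expression language: numbers, addition,
# variables, and let-bindings. No syntax exists yet; programs are built in memory.
@dataclass
class Num:
    value: int

@dataclass
class Add:
    left: object
    right: object

@dataclass
class Var:
    name: str

@dataclass
class Let:
    name: str
    value: object   # expression bound to the name
    body: object    # expression evaluated with the binding in scope

def evaluate(node, env):
    """Tree-walking interpreter: execute the AST directly, no parser involved."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Add):
        return evaluate(node.left, env) + evaluate(node.right, env)
    if isinstance(node, Var):
        return env[node.name]
    if isinstance(node, Let):
        return evaluate(node.body, {**env, node.name: evaluate(node.value, env)})
    raise TypeError(f"unknown node: {node!r}")

# Roughly: let x = 2 + 3 in x + x
program = Let("x", Add(Num(2), Num(3)), Add(Var("x"), Var("x")))
print(evaluate(program, {}))  # 10
```

Once the mechanics feel right, bolting a parser onto the front is a much smaller, better-defined job.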
When it comes time to actually write the parser, I recommend either hand-crafting a parser directly or using an existing parser generator tool like GNU Bison, etc. I do not recommend trying to write your own parser generator (e.g. LR(k), LALR, LL(k), etc.) unless your language's syntax is particularly simple. Speaking from experience, real languages have many common syntax features we take for granted that are hard to deal with in parser generators. In my case, I spent years bogged down exploring/implementing several state-of-the-art parsing algorithms (I'm a fan of generalized parsing, so Earley, GLR, SRNGLR, GLL, etc.), and only recently made any decent progress when I decided to trash them and just hand-write a dumb recursive-descent-esque parser. Once things are working, it's pretty easy to go back and swap out the parser if you want something fancier.
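A hand-written recursive-descent parser for the toy language above can be very small. Here's a sketch (reusing the hypothetical Num/Add nodes and evaluate from the previous snippet); the point is that each grammar rule becomes one function:

```python
import re

# Recursive descent for the toy grammar:
#   expr := term ("+" term)*
#   term := NUMBER | "(" expr ")"
class Parser:
    def __init__(self, src):
        self.tokens = re.findall(r"\d+|[()+]", src) + ["<eof>"]
        self.pos = 0

    def next(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_expr(self):
        node = self.parse_term()
        while self.tokens[self.pos] == "+":   # left-associative chain of "+"
            self.pos += 1
            node = Add(node, self.parse_term())
        return node

    def parse_term(self):
        token = self.next()
        if token.isdigit():
            return Num(int(token))
        if token == "(":
            node = self.parse_expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected {token!r}")

print(evaluate(Parser("(1 + 2) + 3").parse_expr(), {}))  # 6
```

Swapping this out later for a generated parser (or a fancier algorithm) only touches the front end; the ASTs and interpreter stay the same.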
I agree about avoiding writing your own parser generator; that's precisely what I'm doing and it's hell. I assumed you'd probably want to pick up some understanding of how parsers differ when it comes to writing grammars. As for ease of use and requiring the least understanding, something like an Earley parser is probably the easiest: it's slower than other parsing algorithms, but it can handle ambiguous grammars, making it ideal for first-timers learning how to write a programming language.
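For a feel of what "handles ambiguous grammars" buys you, here's a small sketch using the Python Lark library's Earley mode. I'm assuming its parser="earley" and ambiguity="explicit" options here (check the docs for your version):

```python
from lark import Lark

# A deliberately ambiguous expression grammar: no precedence or associativity
# is declared, so "1 + 2 * 3" has more than one parse tree.
grammar = r"""
    ?start: expr
    expr: expr "+" expr
        | expr "*" expr
        | NUMBER

    %import common.NUMBER
    %import common.WS
    %ignore WS
"""

# Earley accepts the ambiguous grammar as-is; ambiguity="explicit" keeps every
# alternative in the result (under _ambig nodes) instead of picking one.
parser = Lark(grammar, parser="earley", ambiguity="explicit")
print(parser.parse("1 + 2 * 3").pretty())
```

An LALR or LL(k) generator would reject this grammar outright until you refactor the precedence into it, which is exactly the extra understanding a beginner can postpone with Earley.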
I just default to recursive descent parsers (with Pratt parsing): simple, efficient, great error messages, and almighty (they cope with pretty much any CFG you'd realistically write). For quick prototyping I really like https://github.com/zesterer/chumsky at the moment (Pratt parsing was just added, need to try that out again).
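The core of Pratt parsing (precedence climbing) is small enough to sketch. Here's a rough Python version with a made-up binding-power table; as I understand it, this is the same idea chumsky's new pratt support wraps in combinators:

```python
import re

# Pratt-style sketch: one loop plus a binding-power table replaces a whole
# chain of grammar-level precedence rules.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def tokenize(src):
    return re.findall(r"\d+|[+\-*/()]", src) + ["<eof>"]

def parse(tokens, pos=0, min_bp=0):
    """Parse an expression, returning (ast, next_position)."""
    token = tokens[pos]
    if token == "(":
        left, pos = parse(tokens, pos + 1, 0)
        pos += 1                      # skip the closing ")"
    else:
        left, pos = int(token), pos + 1
    # Keep consuming operators that bind more tightly than our caller's operator.
    while BINDING_POWER.get(tokens[pos], 0) > min_bp:
        op = tokens[pos]
        right, pos = parse(tokens, pos + 1, BINDING_POWER[op])
        left = (op, left, right)      # tuple-shaped AST node
    return left, pos

ast, _ = parse(tokenize("1 + 2 * 3 - 4"))
print(ast)  # ('-', ('+', 1, ('*', 2, 3)), 4)
```

Adding a new operator is one table entry, which is a big part of why the technique is so pleasant in hand-written parsers.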
But writing a parser generator is certainly an interesting academic task.
Very nice. I was basically forking Python's Lark and rewriting it in C, with some adjustments to the Earley parser, in an experiment to parallelize the processing with Vulkan Compute.
I've been following the book Crafting Interpreters, which builds a language slightly similar to Lua but with bracketed blocks and other small changes. It's a stack-based interpreter, which feels like a great place to start. There's also the register-based approach, which seems to use word code rather than bytecode, but I couldn't tell you much else about the differences. I'd highly recommend building an interpreter first: it's a virtual machine, so you can make your own fake assembly with fewer consequences. Writing a real native-code compiler is a lot harder if you don't have that baseline under your belt, but you do you if you're confident in your know-how!
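The "fake assembly" really is approachable; here's a toy sketch of a stack-based bytecode loop in Python (nothing like the book's clox in detail, and the opcodes are made up):

```python
# Each instruction pushes to or pops from an operand stack.
PUSH, ADD, MUL, PRINT = range(4)

def run(code):
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])   # the operand is inlined after the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == PRINT:
            print(stack.pop())

# Bytecode for: print (1 + 2) * 3
run([PUSH, 1, PUSH, 2, ADD, PUSH, 3, MUL, PRINT])   # prints 9
```

The compiler's job then reduces to walking the AST and emitting these instruction lists, which is much gentler than emitting real machine code.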
I guess the biggest takeaway I've come across is that you can implement functions without too much headache, but closures are much harder: you need to move local variables off the stack and onto the heap. Not sure how register-based VMs handle it.
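Here's the problem in miniature (plain Python, just to show the shape of it): a captured local has to outlive its stack frame, which is why Crafting Interpreters promotes such locals into heap-allocated "upvalues":

```python
# Why closures force locals onto the heap: `count` must outlive make_counter's
# stack frame, because the returned function still reads and writes it.
def make_counter():
    count = 0                 # a local that escapes its frame
    def next_value():
        nonlocal count
        count += 1
        return count
    return next_value

counter = make_counter()      # make_counter's frame is gone by now...
print(counter(), counter())   # ...but count lives on: prints "1 2"
```

In a purely stack-based VM, `count` would have been popped when `make_counter` returned, so the closure has to carry a heap cell for it instead of a stack slot.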
I definitely recommend that you start learning about the LL(k), LALR, and perhaps even Earley parsing algorithms. I'm assuming you've already picked up a little about LL(1) parsing and basic lexing, so mastering those parsing algorithms is basically the next stop for you.
Once you have a grasp of those things, you are well on your way to designing a programming language.
What do you wish you had known before you set out?
If you plan to create an imperative language: if you want to make everything an expression, make sure you understand all the implications.
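For example (a made-up sketch, in Python only because it's convenient): once `if` is an expression, you have to decide what an `if` with no `else`, a loop, or an assignment evaluates to, and that's exactly the kind of implication that sneaks up on you.

```python
from dataclasses import dataclass

# A hypothetical "unit" value for expressions with no useful result,
# similar in spirit to Rust's ().
UNIT = object()

@dataclass
class If:
    cond: object
    then: object
    orelse: object = None     # the else branch may be missing

def eval_expr(node):
    if isinstance(node, If):
        if eval_expr(node.cond):
            return eval_expr(node.then)
        return eval_expr(node.orelse) if node.orelse is not None else UNIT
    return node               # literals evaluate to themselves

print(eval_expr(If(True, 42)))           # 42
print(eval_expr(If(False, 42)) is UNIT)  # True: so what should `let x = if c then 42` mean?
```

The same question comes back for loops, assignments, and blocks, and your answers ripple into the type system and the error messages.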