this post was submitted on 21 Aug 2023
41 points (100.0% liked)


For a current project, I’ve been struggling with my language files. They’re all JSON files, and will always fall back to English if translations aren’t available.

My problem is that when a new key is required, I add it only to my English file by default. This leads to situations where my client wants to translate new keys to other languages, and I have to spend time going through all the files, figuring out which keys I haven’t added there.

Essentially I want to get to a point where I can give all the translation files to my client, and he returns them with the translated content.

What do you guys use for managing this? And how would you solve the situation I’ve found myself in?

[–] properlypurple@lemmy.blahaj.zone 18 points 1 year ago (1 children)

Is there a specific reason your translations are in json files, and not using a system such as gettext? IMO those .po files are much easier to translate for non-tech people, especially with many third party apps available for editing them. There are libraries available for many languages, so you don't have to do anything manually in most cases.

[–] BetaSalmon@lemmy.world 10 points 1 year ago

No real reason for using JSON files, other than it's a web app built in JavaScript, so JSON was kinda the default for it. Definitely open to changing to whatever makes it more convenient to manage.

[–] o11c@programming.dev 16 points 1 year ago

Stop reinventing the wheel.

Major translation systems like gettext (especially the GNU variant) have decades of tooling built up for "merging" and all sorts of other operations.

Even if you don't want to use their binary format at runtime, their tooling is still worth it.

[–] Durotar@lemmy.ml 11 points 1 year ago (2 children)

I have to spend time looking at all files, figuring out which keys I haven’t added there

It sounds like a simple bash script could do that for you. Take keys from the English language file, compare against other language files, find missing keys for each language.
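A rough sketch of that comparison, written in JavaScript rather than bash since the app itself is JavaScript. The helper name `findMissingKeys` and the sample objects are made up for illustration, and it assumes the language files have already been loaded with `JSON.parse` and may contain nested objects.

```javascript
// Hypothetical helper: list every key path that exists in `reference`
// (English) but not in `target`. Nested keys are reported as dotted paths.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function findMissingKeys(reference, target, prefix = "") {
  const missing = [];
  for (const [key, value] of Object.entries(reference)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const hasKey =
      isPlainObject(target) && Object.prototype.hasOwnProperty.call(target, key);
    if (!hasKey) {
      // A missing subtree is reported once, by the path of its root.
      missing.push(path);
    } else if (isPlainObject(value)) {
      missing.push(...findMissingKeys(value, target[key], path));
    }
  }
  return missing;
}

// Sample data, invented for this example.
const en = { menu: { home: "Home", settings: "Settings" }, save: "Save" };
const nl = { menu: { home: "Start" } };
const missingInNl = findMissingKeys(en, nl); // ["menu.settings", "save"]
```

Running this once per language file gives a per-language list of keys to hand to the translator.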

[–] BetaSalmon@lemmy.world 5 points 1 year ago

Yep, due to another comment (https://lemmy.world/comment/2637663) I started working on putting something simple together.

[–] starman@programming.dev 4 points 1 year ago

Or any diff app

[–] DaleGribble88@programming.dev 7 points 1 year ago (1 children)

A trick the indie game development community has used for years is just a simple Excel file. CSVs are the easiest to work with if you are unfamiliar. The first column is the ID of the text that you can reference in code, and each following column is a translation of that text. Get the initial translation in place, typically English, then email the Excel file to anyone who asks to create a fan translation. Also, unless you are translating the Iliad, the extra memory use is negligible.
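To make the layout concrete, here is a minimal sketch of reading such a sheet into one lookup table per language. The function name `parseTranslationCsv` and the sample sheet are hypothetical, and the parsing assumes a header row like `id,en,de` and cells without commas, quotes, or line breaks. Anything beyond that needs a real CSV parser.

```javascript
// Sketch: turn a translation sheet exported as CSV into one table per
// language. Assumes the first row is a header ("id,en,de,...") and that
// no cell contains commas, quotes, or line breaks.
function parseTranslationCsv(csvText) {
  const rows = csvText
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .map((line) => line.split(","));
  const [header, ...records] = rows;
  const languages = header.slice(1);

  const tables = {};
  for (const language of languages) {
    tables[language] = {};
  }
  for (const [id, ...cells] of records) {
    languages.forEach((language, index) => {
      const text = (cells[index] ?? "").trim();
      // Leave empty cells out so they show up as missing translations.
      if (text !== "") {
        tables[language][id] = text;
      }
    });
  }
  return tables;
}

// Sample sheet, invented for this example: SAVE has no German text yet.
const sheet = "id,en,de\nMENU,Menu,Menü\nSAVE,Save,\n";
const tables = parseTranslationCsv(sheet);
```

With this sample, `tables.en` contains both IDs while `tables.de` only contains `MENU`, so an empty cell in the sheet is exactly what an untranslated string looks like.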

[–] Faresh@lemmy.ml 4 points 1 year ago* (last edited 1 year ago) (1 children)

CSVs are the easiest to work with if you are unfamiliar

A disadvantage with this is, if you ever want to collaborate with someone else using version control, it will increase the amount of merge conflicts, because multiple strings will be on the same line.

[–] DaleGribble88@programming.dev 4 points 1 year ago

True, but if there is a large project with many different collaborators, they'd need a more verbose system than a CSV file anyway. (And likely a more senior developer who knows how to handle situations like this.) My point is that excel files, and CSVs in particular, are easy to parse, easy to check for completeness, and easy to distribute to less technical people. Basically, while not optimal, they will just work.

[–] starman@programming.dev 6 points 1 year ago* (last edited 1 year ago) (1 children)

You could write a script that updates the other files based on the English file. It could put something like

"key": "UNTRANSLATED"

in the missing fields.
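A minimal sketch of such a script's core, assuming the files are already parsed into plain objects. `fillMissing` and the sample data are hypothetical names, and the placeholder string is just the one suggested above.

```javascript
// Sketch: return a copy of `target` in which every key that exists in
// `reference` (English) but not in `target` is filled with a placeholder.
// Neither input object is modified.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function fillMissing(reference, target, placeholder = "UNTRANSLATED") {
  const result = isPlainObject(target) ? { ...target } : {};
  for (const [key, value] of Object.entries(reference)) {
    if (isPlainObject(value)) {
      // Recurse so nested groups of keys are filled in as well.
      result[key] = fillMissing(value, result[key], placeholder);
    } else if (!Object.prototype.hasOwnProperty.call(result, key)) {
      result[key] = placeholder;
    }
  }
  return result;
}

// Sample data, invented for this example.
const english = { title: "Title", form: { ok: "OK", cancel: "Cancel" } };
const german = { form: { ok: "OK" } };
const filled = fillMissing(english, german);
```

Writing `JSON.stringify(filled, null, 2)` back to the language file would then give the translator a file with every key present.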

[–] coloredgrayscale@programming.dev 3 points 1 year ago (1 children)

Usually a translation system returns the key itself if the translation is missing. If you fill in "UNTRANSLATED" as a default, the entry no longer counts as missing, and that literal text is what gets displayed as the fallback.

Unless you reinvent the wheel for lookup and make it ignore your magic value, or put an if on every value lookup.

Might be a risk there.

[–] starman@programming.dev 2 points 1 year ago (1 children)

You're right, I didn't think about it

[–] Vorpal@programming.dev 1 points 1 year ago

Your idea will work with minor changes (if comments are supported in your file format). At work our tooling creates entries like 123="English text" // UNTRANSLATED. Obviously not quite the same format, but it should be adaptable to any format that supports comments.

[–] Almamu@lemmy.world 4 points 1 year ago

Part of my CI/CD is a script that makes builds fail if any of my languages are missing keys. It takes English as the desired result, checks that every key exists in the other languages, and prints a report of the missing ones in CSV format so I can send them to translators. I generally run this manually before committing, but having it in the CI/CD helps prevent these errors from making it into production...
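Something along these lines could serve as the core of such a check. This is only a sketch, not the actual script described above: `buildMissingReport`, the `locale,key,english` column layout, and the sample data are all assumptions made for illustration.

```javascript
// Flatten nested messages into { "nav.back": "Back", ... }.
function flatten(messages, prefix = "") {
  const flat = {};
  for (const [key, value] of Object.entries(messages)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object") {
      Object.assign(flat, flatten(value, path));
    } else {
      flat[path] = String(value);
    }
  }
  return flat;
}

// Quote a CSV cell so commas and quotes inside the text survive.
function csvCell(text) {
  return `"${text.replace(/"/g, '""')}"`;
}

// Compare every locale against English and build a CSV report of what is
// missing. `ok` is true only when no locale is missing anything.
function buildMissingReport(english, locales) {
  const reference = flatten(english);
  const lines = ["locale,key,english"];
  for (const [locale, messages] of Object.entries(locales)) {
    const translated = flatten(messages);
    for (const [key, text] of Object.entries(reference)) {
      if (!Object.prototype.hasOwnProperty.call(translated, key)) {
        lines.push([locale, key, text].map(csvCell).join(","));
      }
    }
  }
  return { ok: lines.length === 1, csv: lines.join("\n") };
}

// Sample data, invented for this example: French lacks "nav.back".
const report = buildMissingReport(
  { greeting: "Hello, world", nav: { back: "Back" } },
  {
    fr: { greeting: "Bonjour, le monde" },
    es: { greeting: "Hola", nav: { back: "Atrás" } },
  }
);
```

In a Node-based pipeline, the build step would print `report.csv` and exit with a non-zero status when `report.ok` is false.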

[–] koreth@lemm.ee 4 points 1 year ago* (last edited 1 year ago)

Here's some Apache-licensed code that addresses this exact problem. The language files are in CSV format and get turned into JS files as a build step. It prints warnings for strings that are missing from other languages. In dev environments, there's middleware that watches for edits to the CSV files and rebuilds the JS files.

[–] wito@lemmy.techtailors.net 3 points 1 year ago* (last edited 1 year ago) (2 children)

If you are using TypeScript it's quite easy to create a system where the type system will enforce the existence of all translations. I think it should be possible to create a similar solution for other languages as well.

For example:

const enTranslations = { MENU: '' };

const plTranslations: typeof enTranslations = { MENU: '' } as const;

// getLanguage() is a placeholder for however the app determines the active language
const t = (key: keyof typeof enTranslations) => getLanguage() === 'pl' ? plTranslations[key] : enTranslations[key];

Missing keys will fail compilation. If you want to skip the check you can always use //@ts-ignore

Additionally, the type system will only accept valid translation keys, so you won't be able to make a typo or forget to add the English translation.

[–] BetaSalmon@lemmy.world 2 points 1 year ago (3 children)

I think I actually just want a system, which will take my English file as the default, and add the missing keys to the rest of the language files.

[–] towerful@programming.dev 3 points 1 year ago* (last edited 1 year ago) (1 children)

Perhaps, the "fallback to English translations file at runtime" is obscuring errors.
Might be worth redefining the system to throw an error when a translation key in the chosen language isn't found. Even if that's only done in Dev, whereas the fallback happens in prod.
This will ensure a translation file has all the keys, even if the values are still default.

It would also help to have some tooling that lets you easily add a new key and adds it to all language files as [word] or something. That way the English word is still used, but the square brackets show that it's untranslated.
Maybe some tooling to find all values that have the [], to generate a translation to-do list.
Probably a tool to create a new translation file as well, which would duplicate the English file, but apply the "this is not translated" pattern to all the values
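A small sketch of the first and third ideas, assuming flat key/value language objects. `createTranslator`, its `strict` option, and `listUntranslated` are hypothetical names, not part of any particular i18n library.

```javascript
// Sketch: a lookup that throws on a missing key when `strict` is on (dev)
// and quietly falls back to English when it is off (prod).
function createTranslator(messages, english, { strict = false } = {}) {
  return function t(key) {
    if (Object.prototype.hasOwnProperty.call(messages, key)) {
      return messages[key];
    }
    if (strict) {
      throw new Error(`Missing translation for key "${key}"`);
    }
    return english[key];
  };
}

// Collect keys whose value is still wrapped in [square brackets], i.e.
// entries copied from English that nobody has translated yet.
function listUntranslated(messages) {
  return Object.keys(messages).filter((key) => /^\[.*\]$/.test(messages[key]));
}

// Sample data, invented for this example.
const english = { save: "Save", cancel: "Cancel", help: "Help" };
const dutch = { save: "Opslaan", cancel: "[Cancel]" };

const tProd = createTranslator(dutch, english);
const tDev = createTranslator(dutch, english, { strict: true });
const todo = listUntranslated(dutch); // ["cancel"]
```

Here `tProd("help")` returns the English text, `tDev("help")` throws, and `todo` is the translation to-do list for the Dutch file.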

[–] gentooer@programming.dev 1 points 1 year ago (1 children)

Are warnings not a thing anymore?

[–] coloredgrayscale@programming.dev 1 points 1 year ago (1 children)
[–] gentooer@programming.dev 1 points 1 year ago

Jenkins didn't at my last job.

[–] pinchcramp@lemmy.dbzer0.com 1 points 1 year ago (1 children)

Obviously I don't know your codebase but couldn't you do something like the following?

function loadTranslations(locale) {
  const fallbackTranslations = require("/i18n/en.json");
  const translations = require(`/i18n/${locale}.json`);
  return {
    ...fallbackTranslations,
    ...translations
  };
}
[–] BetaSalmon@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

That won't quite work because a lot of it is nested. But it shouldn't be too hard to account for that in a couple of lines of code.

[–] pinchcramp@lemmy.dbzer0.com 2 points 1 year ago* (last edited 1 year ago) (1 children)

Yeah, I don't know the exact structure of your translation files but a deep merge of your fallback files and the requested locale file should be enough.
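For illustration, a deep merge can be sketched in a few lines. This assumes the files contain only nested plain objects and strings. `deepMerge` is a hand-rolled helper written for this example rather than a library call, and the sample data is made up.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Recursively overlay `overrides` on `base`; neither input is modified.
function deepMerge(base, overrides) {
  const merged = { ...base };
  for (const [key, value] of Object.entries(overrides)) {
    merged[key] =
      isPlainObject(value) && isPlainObject(base[key])
        ? deepMerge(base[key], value)
        : value;
  }
  return merged;
}

// Sample data, invented for this example.
const fallbackTranslations = {
  menu: { home: "Home", about: "About" },
  save: "Save",
};
const translations = { menu: { home: "Inicio" } };
const messages = deepMerge(fallbackTranslations, translations);
// messages.menu.home is "Inicio"; menu.about and save stay in English.
```

Unlike the shallow spread, a partially translated `menu` group no longer replaces the whole English group.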

[–] BetaSalmon@lemmy.world 1 points 1 year ago

Sounds like that will indeed be the easiest and quickest solution for this project.

[–] wito@lemmy.techtailors.net 1 points 1 year ago

Quite a lot of IDEs will let you just click an "add missing properties" action on the translation object to create a language file.

[–] wito@lemmy.techtailors.net 1 points 1 year ago* (last edited 1 year ago) (1 children)

It's also quite easy to transform this file to JSON and send it to translators through any service that supports this format.

[–] Vorpal@programming.dev 2 points 1 year ago

There are existing approaches: GNU gettext and Mozilla Fluent come to mind. I would try to use one of those. I understand that Mozilla Fluent has good support for the Web (unsurprisingly).

[–] 314xel@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

We return / display the translation key name if no translation is available for that language, including for the default (English).

Also, on dev / test environments we can enable a config (.env) setting to append the text " [UNTRANSLATED]" to every value that doesn't have a translation, so they're easier to spot in the website / interface.
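The same behaviour could look roughly like this in JavaScript. This is an analogue sketched for this thread's JSON setup, not the Laravel override itself. The `translate` function and its `markUntranslated` option are hypothetical, and the fallback order (active language, then English, then the key name) is an assumption.

```javascript
// Sketch: fall back from the active language to English, and finally to
// the key name itself. With `markUntranslated` on (dev/test), any value
// that did not come from the active language gets " [UNTRANSLATED]"
// appended so it stands out in the interface.
function translate(key, messages, english, { markUntranslated = false } = {}) {
  if (Object.prototype.hasOwnProperty.call(messages, key)) {
    return messages[key];
  }
  const fallback = Object.prototype.hasOwnProperty.call(english, key)
    ? english[key]
    : key;
  return markUntranslated ? `${fallback} [UNTRANSLATED]` : fallback;
}

// Sample data, invented for this example.
const englishMessages = { save: "Save", help: "Help" };
const dutchMessages = { save: "Opslaan" };

const translated = translate("save", dutchMessages, englishMessages);
const marked = translate("help", dutchMessages, englishMessages, {
  markUntranslated: true,
});
const unknown = translate("nav.back", dutchMessages, englishMessages);
```

In this sample `translated` is the Dutch text, `marked` is the English text with the marker appended, and `unknown` is just the key name.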

I'm talking about a PHP/Laravel project, so it was easy to override the default translation engine behaviour.

[–] BetaSalmon@lemmy.world 1 points 1 year ago

I suppose my main issue is with the actual language files. In terms of translation we also default to English. It's just a tedious job to remember to add the new key to every language file, which is the problem I'm facing now.

Due to so many new features, the non-English language files are quite outdated.