Bram's Dev Blog

DSpace 7 - PO transition

17 Dec 2018

DSpace 7 i18n transition to Po Files - current status

Neither my own team nor the community members have objected so far to getting rid of the intermediate message keys and using the English messages as keys instead.

So right now we have:

This means the challenge is to make a CLI script with grep / awk / sed that:

  1. iterates over the message keys in the .po file
  2. searches the source code for a key
  3. replaces the key with the message in the source code

Step one: make it work for ONE key

Let’s take the following key as the test case:

 msgid "404.help"
 msgstr "We can't find the page you're looking for. The page may have been "
 "moved or deleted. You can use the \"Back\" button below to get back to the home page. "

It has:

I Googled whether anyone had already tackled this particular challenge, but didn’t find anything, so I guess we’re on our own.

Grep: finding the key

A recursive grep that only looks in the ./src folder seems to do the trick. Running it one directory higher brings up too many results, including the language files themselves.

grep -r "404.help" ./src  
./src/app/pagenotfound/pagenotfound.component.html:  <p>404.help</p>
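One caveat with the search above: grep reads the `.` in `404.help` as a regex wildcard, so a string like `404xhelp` would match too. A minimal sketch of a stricter search, using grep's standard `-F` (fixed string) and `-l` (file names only) flags. The `demo` directory and the `other.html` file are made up here purely so the example can run on its own:

```shell
# Hypothetical demo tree so the example is self-contained.
demo=$(mktemp -d)
mkdir -p "$demo/src/app/pagenotfound"
printf '  <p>404.help</p>\n' > "$demo/src/app/pagenotfound/pagenotfound.component.html"
printf '  <p>404xhelp</p>\n' > "$demo/src/app/other.html"

# -r: recurse, -l: print only the names of matching files,
# -F: treat the key as a literal string, so "." only matches "."
matches=$(grep -rlF "404.help" "$demo/src")
echo "$matches"
```

Only the pagenotfound template is listed. The list of file names is also a convenient input for whatever tool does the actual replacing.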

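But after all, I may not need grep for the replacement itself, since sed can find and replace the pattern in one pass. Sed does not recurse into directories on its own though, so something like find or grep -rl still has to supply the list of files.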
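A rough sketch of how grep and sed could be combined for one key, under a few assumptions: sed edits the files it is handed, so `grep -rlF` supplies the list; the message is shortened for readability; and `demo`, `key_re` and `msg_re` are illustrative names, not part of any existing script. The attached `-i.orig` suffix form is accepted by both GNU and BSD sed and leaves a backup next to each changed file.

```shell
# Hypothetical demo tree so the example is self-contained.
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf '<p>404.help</p>\n' > "$demo/src/page.html"

key='404.help'
msg='We can'\''t find the page you'\''re looking for.'

# Escape regex metacharacters in the key so "." only matches a literal dot.
key_re=$(printf '%s' "$key" | sed 's,[][\.*^$/],\\&,g')
# Escape the characters that are special in a sed replacement: \ / and &
msg_re=$(printf '%s' "$msg" | sed 's/[\/&]/\\&/g')

# grep lists the files containing the key; sed rewrites each one in place
# and keeps the original as <file>.orig.
# Assumption: file names contain no newlines.
grep -rlF "$key" "$demo/src" | while IFS= read -r file; do
  sed -i.orig "s/$key_re/$msg_re/g" "$file"
done

cat "$demo/src/page.html"
```

Note that a second run would also pick up the `.orig` backups, so they need to be cleaned up or excluded before the next pass.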

Sed: replacing the key

I last used sed a long time ago, so I need to brush up on it.

Sed has so many options that it was pretty daunting, and I got lost in the many examples that were not entirely applicable.

In "How to replace a string in multiple files in linux command line" I learned about repren.

Replacing strings with repren

The Repren tool seems to be designed for exactly what I want to do. For example:

» ./repren --from 404.help --to "We can't find the page you're looking for. The page may have been moved or deleted. You can use the \"Back\" button below to get back to the home page. " --full --dry-run .
Dry run: No files will be changed
Using 1 patterns:
  '404.help' -> 'We can't find the page you're looking for. The page may have been moved or deleted. You can use the "Back" button below to get back to the home page. '
Found 3 files in: .
- modify: ./another-file.html: 1 matches
- modify: ./intro.html: 1 matches
Read 3 files (23029 chars), found 2 matches (0 skipped due to overlaps)
Dry run: Would have changed 2 files (2 rewritten and 0 renamed)

When executed without the --dry-run flag, it makes backup copies of the files to be changed, with an .orig extension. This first try already worked pretty well; however, the single quotes still needed to be escaped.

For multiple replacements, Repren supports pattern files in which each line contains the string to be replaced, a tab character, and the replacement.

My first pattern file:

404.help	"We can't find the page you're looking for. The page may have been moved or deleted. You can use the \"Back\" button below to get back to the home page. "
404.link.home-page	"Take me to the home page"

Unlike the first individual example, this one copies the surrounding double quotes into the output as well, so the pattern file can’t contain them:

404.help	We can't find the page you're looking for. The page may have been moved or deleted. You can use the \"Back\" button below to get back to the home page.
404.link.home-page	Take me to the home page
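Writing such a pattern file by hand won't scale to the whole .po file, so here is a hedged sketch of how awk could generate it. It assumes every key fits on a single msgid line, joins multi-line msgstr values, and leaves escapes such as `\"` untouched, as in the hand-written file above. The file names `messages.po` and `patterns.tsv` and the helper names `unquote` and `emit` are made up for this example.

```shell
# Hypothetical sample input so the example is self-contained.
demo=$(mktemp -d)
cat > "$demo/messages.po" <<'EOF'
msgid "404.help"
msgstr "We can't find the page you're looking for. The page may have been "
"moved or deleted. You can use the \"Back\" button below to get back to the home page. "

msgid "404.link.home-page"
msgstr "Take me to the home page"
EOF

awk '
# Drop everything up to the first double quote, and the closing double quote.
function unquote(s) {
  sub(/^[^"]*"/, "", s)
  sub(/"[ \t]*$/, "", s)
  return s
}
# Print the collected pair as key<TAB>message, then reset the state.
function emit() {
  if (key != "") printf "%s\t%s\n", key, msg
  key = ""; msg = ""; in_str = 0
}
/^msgid /  { emit(); key = unquote($0); next }
/^msgstr / { msg = unquote($0); in_str = 1; next }
# A line starting with a quote continues the msgstr above it.
/^"/       { if (in_str) msg = msg unquote($0); next }
END        { emit() }
' "$demo/messages.po" > "$demo/patterns.tsv"

cat "$demo/patterns.tsv"
```

Entries with an empty msgid, such as the usual .po header, are skipped because there is no key to search for.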

Right now, I think I’ll just put the single quote escaping into the pattern file itself.

TODO

DSpace 7 Angular

Analyzer.atmire.com work

Productivity

Jekyll http://bram-atmire.github.io/ site

Atmire.com work

Investigate and work on search engine optimization (SEO) for the main atmire.com website. Look into web.dev from Google for this (thank you Philip).

Learning just for learning