Automatic romanization in Cyrillic cue files
-
Hi to all.
I have several .cue files where the song titles are written in Cyrillic. Is there an automatic way to romanize the Cyrillic characters in only the song titles, in all the files open in Notepad++?
Thanks in advance.
This is an example of a cue file:
REM DATE 2010
REM DISCID 4109AC06
PERFORMER “ЭНТРОПИЯ”
TITLE “МИРАЖ”
FILE “ЭНТРОПИЯ - МИРАЖ.WAV” WAVE
TRACK 01 AUDIO
TITLE “МИРАЖ ХРИСТИАНСТВА”
FLAGS DCP
INDEX 01 00:00:00
TRACK 02 AUDIO
TITLE “ДВАДЦАТЬ ДНЕЙ”
FLAGS DCP
INDEX 01 06:08:34
TRACK 03 AUDIO
TITLE “СЕЗОН ДОЖДЕЙ”
FLAGS DCP
INDEX 01 13:53:02
TRACK 04 AUDIO
TITLE “МОЙ МИР”
FLAGS DCP
INDEX 01 21:24:46
TRACK 05 AUDIO
TITLE “БЫСТРОТЕЧНЫЙ МИГ”
FLAGS DCP
INDEX 01 27:01:44
TRACK 06 AUDIO
TITLE “КОГДА НАЧНЁТСЯ ВОЙНА”
FLAGS DCP
INDEX 01 34:18:44
-
Romanization of letters sounds like a job for a scripting language like Python, or if you prefer, a scripting plugin like PythonScript.
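To make that concrete, here is a minimal standalone Python sketch of the idea, not a finished tool. Everything in it is an assumption for illustration: the romanization table is a simplified Russian-only scheme (not any particular standard), the helper names `romanize`, `romanize_track_titles` and `romanize_cue_file` are made up, and the files are assumed to be UTF-8. It treats a TITLE line as a song title only if it comes after a TRACK line, so the album TITLE, PERFORMER and FILE lines are left alone.

```python
from pathlib import Path

# Assumed, simplified Russian romanization table (lowercase keys only).
# Swap in whatever scheme you actually want.
TABLE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def romanize(text):
    """Replace Cyrillic letters found in TABLE; leave everything else as-is."""
    out = []
    for i, ch in enumerate(text):
        rom = TABLE.get(ch.lower())
        if rom is None:
            out.append(ch)  # Latin letters, digits, quotes, spaces, ...
            continue
        if ch.isupper():
            prev = text[i - 1] if i > 0 else ""
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # In an all-caps word use "ZH"; for a lone capital use "Zh".
            if prev.isupper() or nxt.isupper():
                rom = rom.upper()
            else:
                rom = rom.capitalize()
        out.append(rom)
    return "".join(out)


def romanize_track_titles(cue_text):
    """Romanize only the TITLE lines that appear after a TRACK line."""
    in_track = False
    result = []
    for line in cue_text.splitlines(keepends=True):
        stripped = line.lstrip()
        if stripped.startswith("TRACK"):
            in_track = True
        elif in_track and stripped.startswith("TITLE"):
            line = romanize(line)
        result.append(line)
    return "".join(result)


def romanize_cue_file(path, encoding="utf-8"):
    """Rewrite one .cue file in place (the encoding is an assumption)."""
    path = Path(path)
    text = path.read_text(encoding=encoding)
    path.write_text(romanize_track_titles(text), encoding=encoding)
```

With this table, a track line such as TITLE “МОЙ МИР” becomes TITLE “MOY MIR”, while the album-level TITLE “МИРАЖ” near the top of the file is not touched. To process a whole folder you could call `romanize_cue_file` on each path from `Path(folder).glob("*.cue")`; work on copies first, since the function overwrites the file.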
-
Notepad++ regular expressions cannot easily do transliteration (where every character has one exact matching character) or a true romanization (where it might be that a single character on one side corresponds to a pair of characters on another).
But you can manually set up a table for the regex. For example, picking three of the letter pairs (upper and lower case) from here:
FIND = (?-i)(?:(Г)|(г)|(Д)|(д)|(Ф)|(ф))
REPLACE = (?{1}G)(?{2}g)(?{3}D)(?{4}d)(?{5}F)(?{6}f)
SEARCH MODE = Regular expression
That does a case-sensitive search for those pairs. You would just have to extend the pattern with a | between each in the FIND, with parentheses around each character, and then the replacement has to have the (?{#}x) for each, where # is the appropriate index into your list of Cyrillic characters, and the x is the right romanization for the Cyrillic. It can be more than one letter: for example, that same table says that Ж is Zh, so if it were the 22nd character in your FIND, the replacement would be (?{22}Zh) …
-
@Mark-Olson
Thanks for the reply. You are absolutely right; in fact, a friend who programs in Python has prepared a script that works perfectly for this need.
@PeterJones
Thanks to you as well. I will also try this regex. Thanks for the help.