Alignment of 2 translation memories with one common language

Hello,

Is there any way to align two different TMs with one common language to get a third TM?

I have a TM with the language pair English > French and a TM with the language pair French > German (with the same content, as the files were translated from English to French and then from French to German). I now need English > German translations of similar documents, so I'd like to align these two TMs and create an English > German TM from the content of the two TMs mentioned above (English > French and French > German).

Does anyone know if that's possible anyhow?

Many thanks in advance.

  •   

    There are certainly ways forward with this.

    The approach I would take is:

    1. Export the TMs to TMX
    2. Create EN > FR, DE project(s), using your TMX files as the native file type
    3. From your SDLXLIFF file(s), use the Export to Excel app
    4. Merge the Excel files so you have a complete English column followed by two dedicated columns for FR and DE
      Example: A = EN, B = FR and C = DE
    5. Create a new project using your merged Excel file, with the Multilingual Excel file type

    When working your way through your newly created project based on your merged Excel, you will now find the "gaps" between your FR and DE translations.
    You can either create new TMs or re-add your existing TMs to update them.
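
    The merge in step 4 is essentially a join on the shared French column. Below is a minimal sketch of that idea in Python, assuming the two exports have already been reduced to plain (source, target) rows. The function name `pivot_rows` and the sample rows are made up for illustration, and the sketch does not read or write Excel files.

```python
def pivot_rows(en_fr_rows, fr_de_rows):
    """Join (English, French) rows with (French, German) rows on the French text."""
    # Look up German by its French source; a repeated French segment keeps the last German
    german_by_french = {fr: de for fr, de in fr_de_rows}
    merged = []
    for en, fr in en_fr_rows:
        # An empty German cell marks a "gap" that still needs attention
        merged.append((en, fr, german_by_french.get(fr, "")))
    return merged


# Hypothetical sample data standing in for the two exported files
en_fr = [("Hello", "Bonjour"), ("Thank you", "Merci"), ("Goodbye", "Au revoir")]
fr_de = [("Bonjour", "Hallo"), ("Merci", "Danke")]

for row in pivot_rows(en_fr, fr_de):
    print(row)
```

    Rows whose French text has no exact counterpart in the French > German data come out with an empty third column, which corresponds to the gaps mentioned above.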

    There is also a feature called AnyTM that offers some flexibility when working with TMs that don't match language pairs. Details of which are here

    I hope this helps and keen to see if anyone has other ideas

    Lyds

    Lydia Simplicio | RWS Group

    _______
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  •  

    I hope this helps and keen to see if anyone has other ideas

    Indeed a smart solution.  But given the use of AI these days I thought I'd try a different approach for fun.  It didn't take very long and you may be interested.

     

    Does anyone know if that's possible anyhow?

    Here's a way using the concept of multilingual TMX files.  I created a Python script with the help of ChatGPT that can take the English to French TMX and the French to German TMX and merge them to create a multilingual TMX with all three languages in there.  Then I can import that TMX into an English to German SDLTM and it will populate with the English to German translation units.

    Here's the script: 

    import copy
    import xml.etree.ElementTree as ET
    
    # ElementTree stores xml:lang under the fully qualified XML namespace
    XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
    
    
    def find_tuv(tu, lang):
        """Return the <tuv> of a translation unit for the given language code, or None."""
        for tuv in tu.findall("tuv"):
            if tuv.get(XML_LANG) == lang:
                return tuv
        return None
    
    
    def seg_text(tuv):
        """Return the plain text of a <tuv>'s <seg>, including text inside inline tags."""
        seg = tuv.find("seg") if tuv is not None else None
        return "".join(seg.itertext()).strip() if seg is not None else None
    
    
    def merge_tmx(eng_fr_file, fr_de_file, output_file, pivot="fr-FR", target="de-DE"):
        # Parse the English-French and the French-German TMX files
        tree_eng_fr = ET.parse(eng_fr_file)
        tree_fr_de = ET.parse(fr_de_file)
    
        # Create a dictionary mapping each French segment to its German <tuv>
        # (if a French segment occurs more than once, the last one wins)
        fr_de_dict = {}
        for tu in tree_fr_de.getroot().findall(".//tu"):
            french = seg_text(find_tuv(tu, pivot))
            german_tuv = find_tuv(tu, target)
            if french and german_tuv is not None:
                fr_de_dict[french] = german_tuv
    
        # Iterate through the English-French TMX and add German where the French matches exactly
        for tu in tree_eng_fr.getroot().findall(".//tu"):
            french = seg_text(find_tuv(tu, pivot))
            if french in fr_de_dict:
                # Copy the whole element so inline markup in the German segment is kept
                tu.append(copy.deepcopy(fr_de_dict[french]))
    
        # Write the merged TMX to a new file
        tree_eng_fr.write(output_file, encoding="utf-8", xml_declaration=True)
    
    
    if __name__ == "__main__":
        # Prompt for file names
        eng_fr_file = input("Enter the path of the English-French TMX file: ")
        fr_de_file = input("Enter the path of the French-German TMX file: ")
        output_file = "en-GB_fr-FR_de-DE.tmx"
    
        merge_tmx(eng_fr_file, fr_de_file, output_file)
        print(f"Multilingual TMX file created: {output_file}")
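
    To see what the pivot does without touching real files, here is a small self-contained sketch that runs the same idea on two tiny in-memory TMX strings. The sample segments, the stripped-down `<header/>` and the helper names `segments_by_lang` and `pivot` are assumptions made for illustration; a real TMX export carries a full header and usually inline tags, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fully qualified XML namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, heavily simplified TMX content
EN_FR = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-GB"><seg>Hello</seg></tuv><tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv></tu>
<tu><tuv xml:lang="en-GB"><seg>Goodbye</seg></tuv><tuv xml:lang="fr-FR"><seg>Au revoir</seg></tuv></tu>
</body></tmx>"""

FR_DE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv><tuv xml:lang="de-DE"><seg>Hallo</seg></tuv></tu>
</body></tmx>"""


def segments_by_lang(tu):
    """Map each language code in a translation unit to its segment text."""
    return {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}


def pivot(en_fr_xml, fr_de_xml, pivot_lang="fr-FR", target_lang="de-DE"):
    """Add the German segment to every English-French unit with a matching French segment."""
    en_fr_root = ET.fromstring(en_fr_xml)
    german_by_french = {}
    for tu in ET.fromstring(fr_de_xml).findall(".//tu"):
        segs = segments_by_lang(tu)
        if pivot_lang in segs and target_lang in segs:
            german_by_french[segs[pivot_lang]] = segs[target_lang]
    for tu in en_fr_root.findall(".//tu"):
        french = segments_by_lang(tu).get(pivot_lang)
        if french in german_by_french:
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: target_lang})
            ET.SubElement(tuv, "seg").text = german_by_french[french]
    return en_fr_root


merged = pivot(EN_FR, FR_DE)
for tu in merged.findall(".//tu"):
    print(segments_by_lang(tu))
```

    The first unit ends up with all three languages, while the second keeps only English and French because its French segment has no exact match in the French to German data. Only exact matches on the French text are carried over.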

    And here's a video explaining how to use it:

    Paul Filkin | RWS Group

  • Many thanks for your reply. I gave it a first try, but the export to Excel took a very long time, as it's a large TM, and I had to cancel it because I could not continue working while the export was running. I might give it another try overnight.

    Thanks!

  • Many thanks for your reply as well. I'll definitely give it a try, thanks.

  • Hello,

    Yes, it's possible to create a new TM with the language pair English > German using your existing TMs (English > French and French > German). Here's how you can do it:

    Step 1: Open Trados Studio and go to the Translation Memories view.

    Step 2: Create a new TM with the language pair English > German.

    Step 3: Import the English > French TM into the newly created TM. This will add all the English > French translation units to the new TM.

    Step 4: Now, you need to convert the French > German TM into an English > German TM. To do this, export the French > German TM to a TMX file.

    Step 5: Open the exported TMX file in a text editor and replace all instances of the language code for French with the language code for English. Save the changes.

    Step 6: Import the modified TMX file into the new English > German TM. This will add all the English > German translation units to the new TM.

    Please note that this method assumes that the French translations in both TMs are identical. If they are not, the alignment may not be accurate.

    I hope this helps! If you have any other questions, feel free to ask.

    Best regards,

    RWS Community AI

  •  

    The TradosAI solution will work technically with one drawback.  It will contain English TUs in the French TM!  A little too much hallucination!
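
    A minimal sketch of what the relabelling in Step 5 does to a TMX, run on one made-up translation unit. The sample content, the bare `<header/>` and the helper name `relabel` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fully qualified XML namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, heavily simplified French-German TMX
FR_DE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv><tuv xml:lang="de-DE"><seg>Hallo</seg></tuv></tu>
</body></tmx>"""


def relabel(tmx_xml, old_lang, new_lang):
    """Change the language code on matching <tuv> elements; the segment text is untouched."""
    root = ET.fromstring(tmx_xml)
    for tuv in root.iter("tuv"):
        if tuv.get(XML_LANG) == old_lang:
            tuv.set(XML_LANG, new_lang)
    return root


relabelled = relabel(FR_DE, "fr-FR", "en-GB")
for tuv in relabelled.iter("tuv"):
    print(tuv.get(XML_LANG), "->", tuv.findtext("seg"))
```

    Only the language label changes. The segment is still the French text, so importing such a file gives units whose "English" side is really French.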

    Paul Filkin | RWS Group

  •  

    I'll definitely give it a try

    You should... this way you will also retain the document structure for better context.  And it's fast.

    Paul Filkin | RWS Group

  • Thanks a lot for this, I tried it and it was indeed very fast. The aligned TM is quite large, but at first glance the results look quite good. Thanks for trying this new technology :)
