metadata missing after importing DGT tmx into sdltm

Hello everyone,

I'm writing about a problem importing the DGT TMX (available here https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory) into an sdltm. I can import the segments but not the fields attached to them.

I would need the final sdltm to display the fields I can see in the TMX file which show the type of text the translation unit was taken from.

I have tried to perform the import using both "the imported data will be primarily used in mixed scenarios" and "the imported data will be primarily used with presegmented legacy SDL Trados ITD or TTX files" but the result was the same: no TMX field was imported into the sdltm.

Does anyone have a suggestion on how to import translation unit fields from TMX files into an sdltm?

Davide

  • Hi Davide,

    If you import the TMX then you need to make sure that you already have the appropriate fields available in the SDLTM so they can be populated. If you "upgrade" the TMX then these fields will be created during the upgrade process.

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    thank you very much for your reply.

    I tried both ways (import with new fields, and upgrade) but the result is always a TM with plain translation units and no extra fields.

    For additional context: during both the import and the upgrade tests I tried the settings "The imported data will be primarily used in mixed scenarios" and "The imported data will be primarily used with presegmented legacy SDL Trados ITD files and TTX files", but each time the final TM contained no segment fields.

    Let me clarify why I'm trying to do all this: I need to keep only the DGT TM segments whose fields contain certain CELEX codes (for example "2017R1001").

    I would need to split the whole DGT TM into relatively smaller TMs divided by type of source text (directives, regulations, and so on).

    To do this I thought I could:

    1) filter the raw tmx file (using CELEX numbers) and then import it into an sdltm, or

    2) import the whole tmx file into an sdltm and then apply the filter.

    As for option 1, I managed to filter the TMX file (via Olifant or Xbench, I'm not sure which one I used), but I could not import the filtered file into an sdltm: the import process rejected it, apparently because the filtered TMX contained errors.

    So I'm relying on option 2 but, regardless of whether I import the TMX file or upgrade it, I don't get the fields that would make the filtering step possible.
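
    For reference, option 1 can also be sketched with a plain XML parser, which should keep the output well-formed. This is only a minimal sketch under assumptions: the prop type "Txt::Doc. No." is taken from an example later in this thread, the CELEX pattern (optional sector digit, four-digit year, type letters, number) is a guess at the format, and `split_by_doc_type` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: the document number sits in a <tu>-level <prop> of this type.
DOC_NO_PROP = "Txt::Doc. No."
# Assumption: CELEX-like values such as "2017R1001" or "32017R1001".
CELEX_RE = re.compile(r"^\d?(\d{4})([A-Z]{1,2})\d")


def doc_type(tu):
    """Return the CELEX type letter(s) of a <tu>, or None if nothing matches."""
    for prop in tu.findall("prop"):
        if prop.get("type") == DOC_NO_PROP:
            match = CELEX_RE.match((prop.text or "").strip())
            if match:
                return match.group(2)
    return None


def split_by_doc_type(tmx_text):
    """Return {type letters: TMX string}; each part reuses the original header."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    groups = defaultdict(list)
    for tu in root.iter("tu"):
        groups[doc_type(tu)].append(tu)

    parts = {}
    for key, tus in groups.items():
        if key is None:
            continue  # units without a recognisable document number are skipped
        new_root = ET.Element("tmx", root.attrib)
        if header is not None:
            new_root.append(header)
        ET.SubElement(new_root, "body").extend(tus)
        parts[key] = ET.tostring(new_root, encoding="unicode")
    return parts


# Invented sample data, not taken from the real DGT TM.
SAMPLE = """<tmx version="1.4"><header srclang="EN-GB"/><body>
<tu><prop type="Txt::Doc. No.">2017R1001</prop>
<tuv lang="EN-GB"><seg>A regulation.</seg></tuv>
<tuv lang="IT-IT"><seg>Un regolamento.</seg></tuv></tu>
<tu><prop type="Txt::Doc. No.">2014L0065</prop>
<tuv lang="EN-GB"><seg>A directive.</seg></tuv>
<tuv lang="IT-IT"><seg>Una direttiva.</seg></tuv></tu>
<tu><prop type="Txt::Doc. No.">800580100</prop>
<tuv lang="EN-GB"><seg>Something else.</seg></tuv>
<tuv lang="IT-IT"><seg>Altro.</seg></tuv></tu>
</body></tmx>"""

parts = split_by_doc_type(SAMPLE)
for letters, tmx in sorted(parts.items()):
    print(letters, tmx.count("<tu>"))
```

    For a real file, `ET.parse(path).getroot()` could replace `ET.fromstring`, and each part would be written out with an XML declaration. Whether Studio then imports the fields is a separate question.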

  • Hi Davide,

    Apologies for the delayed response... I only found time to play with this today. I cannot do this in Studio alone either so I have asked the product team to check this in case I'm missing something. I'll come back to you once I get more feedback.

    In the meantime I did find a workaround, but I'm not sure whether you'll have the right tools for it. The workaround uses Trados Workbench.

    1. Create a new TM in Trados Workbench
    2. Create the Fields you want in File -> Setup
    3. Import the TMX (this retains all the field values if the field names you created match those in the TMX)
    4. Upgrade the Trados TMW in Studio and now you get all the fields
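
    Since step 3 depends on the field names matching, it may help to list the prop types a TMX actually contains before creating the fields. The sketch below is an illustration only: `prop_types` and `field_name` are hypothetical helpers, and the "Txt::" / "Att::" prefixes are assumed to be the Workbench naming scheme for text and attribute fields.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def prop_types(tmx_source):
    """Count <prop> types in a TMX file (path or binary file object), streaming."""
    counts = Counter()
    for _, elem in ET.iterparse(tmx_source, events=("end",)):
        if elem.tag == "prop":
            counts[elem.get("type")] += 1
        elif elem.tag == "tu":
            elem.clear()  # drop the children of each finished <tu> to save memory
    return counts


def field_name(prop_type):
    """Strip the assumed Workbench prefix to get the field name to create."""
    for prefix in ("Txt::", "Att::"):
        if prop_type.startswith(prefix):
            return prop_type[len(prefix):]
    return prop_type


# Invented sample data for illustration.
SAMPLE = b"""<tmx version="1.4"><header srclang="EN-GB"/><body>
<tu><prop type="Txt::Doc. No.">2017R1001</prop>
<tuv lang="EN-GB"><seg>One.</seg></tuv></tu>
<tu><prop type="Txt::Doc. No.">2014L0065</prop>
<tuv lang="EN-GB"><seg>Two.</seg></tuv></tu>
</body></tmx>"""

counts = prop_types(io.BytesIO(SAMPLE))
for prop_type, n in counts.items():
    print(f"{prop_type!r} -> field {field_name(prop_type)!r} ({n} units)")
```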

    If you don't have Workbench, drop me an email with a link to your exported DGT TMX and I can do this for you - pfilkin@sdl.com

    Paul Filkin | RWS Group

  • Hi Paul (pfilkin),

    I have checked the converted TMs you sent me back and they are perfect, thanks!

    This is a screenshot of what they look like:

    That's exactly what I needed.

    As for the TM created from the filtered TMX file (extracted from the full original TMX), did you encounter any particular problems? I couldn't use that TMX file in Studio because of errors in it, so I'm wondering whether you ran into any problems with it at all.

    Also, did you use Trados Workbench to convert the filtered TMX file as well?

  • Hi Davide,

    Glad that worked for you. I didn't have any problems with the TMX in Studio or in Workbench. To achieve the result you shared I did use Workbench with the TMX. The exact process was this:

    1. Create a new TM in Trados Workbench
    2. Create the Fields you want in File -> Setup
    3. Import the TMX (this retains all the field values if the field names you created match those in the TMX)
    4. Upgrade the Trados TMW that was created by Workbench in Studio and now you get all the fields

    I sent you the Workbench TMs as well in case you wanted to upgrade them a different way.

    Paul Filkin | RWS Group

  • Hi again Paul,
    Thanks for your feedback.
    I'll keep the Workbench files as well as the sdltms.
    If the development team finds a way to process those TMX files within Trados Studio too, please keep me/us informed, thank you.
    I don't have Workbench, but I have other files I would like to convert to get similar results.
  • Hi Davide and Paul,

    Sorry for replying to such an old thread, but I am having the same issue as you had. I used to work with Wordfast Pro, which showed the CELEX numbers by default under "Notes". Are you still using Workbench to get around this problem? If yes, can you tell me where I can download it? I see lots of references to it, but no official-looking download link.

    If there's a better solution within Studio 2021 I would love to hear that, of course. Thank you!

    Mark

  • Could you please tell us where we can safely download Trados Workbench to solve this problem?

  • How can I do the same thing via the API?

    The props in the original DGT .tmx look like this:

    <prop type="Txt::Doc. No.">800580100</prop>

    and in my code I have:

    var tm = new FileBasedTranslationMemory(newTMpath);

    var docNo = new FieldDefinition();
    docNo.Name = "Doc. No.";
    docNo.ValueType = FieldValueType.SingleString;

    tm.FieldDefinitions.Add(docNo);

    tm.Save();

    Then I import the .tmx into a Trados TM, which I then export back to .tmx:

    TMupdater tmupdater = new TMupdater();
    tmupdater.UpdateTMwithBilingualFile(newTMpath, tmxpath);

    var tmExporter = new TmExporter();
    tmExporter.Export(newTMpath);

    But this does not seem to preserve the original <prop type="Txt::Doc. No.">800580100</prop> element in the TUs of the final .tmx. I also tried modifying the original .tmx by changing the type attribute from "Txt::Doc. No." to "x-Doc. No.:SingleString", also without success. Any help?
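
    To pin down where the field gets lost, the original and the exported .tmx can be compared outside the API. The Python sketch below is only an illustration: `normalise`, `field_values` and `lost_fields` are hypothetical helpers, and both naming schemes ("Txt::Name" and "x-Name:SingleString") are assumptions taken from the description above.

```python
import re
import xml.etree.ElementTree as ET


def normalise(prop_type):
    """Reduce a prop type to a bare field name (both schemes are assumptions)."""
    name = re.sub(r"^(Txt|Att)::", "", prop_type)  # e.g. "Txt::Doc. No."
    name = re.sub(r"^x-", "", name)                # e.g. "x-Doc. No.:SingleString"
    return re.sub(r":[A-Za-z]+$", "", name)        # drop a trailing ":SingleString"


def field_values(tmx_text):
    """Map field name -> list of values found in <tu>-level <prop> elements."""
    values = {}
    for tu in ET.fromstring(tmx_text).iter("tu"):
        for prop in tu.findall("prop"):
            name = normalise(prop.get("type", ""))
            values.setdefault(name, []).append(prop.text or "")
    return values


def lost_fields(original, exported):
    """Return {field name: number of values missing from the exported TMX}."""
    before, after = field_values(original), field_values(exported)
    return {
        name: len(vals) - len(after.get(name, []))
        for name, vals in before.items()
        if len(after.get(name, [])) < len(vals)
    }


# Invented sample data: the second file mimics an export without the field.
ORIGINAL = """<tmx version="1.4"><header/><body>
<tu><prop type="Txt::Doc. No.">800580100</prop>
<tuv lang="EN-GB"><seg>One.</seg></tuv></tu></body></tmx>"""
EXPORTED = """<tmx version="1.4"><header/><body>
<tu><tuv lang="EN-GB"><seg>One.</seg></tuv></tu></body></tmx>"""

print(lost_fields(ORIGINAL, EXPORTED))
```

    An empty result would mean every field value survived the round trip. A non-empty one shows which fields were dropped, but not whether the import or the export step dropped them.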
