Importing Word documents containing links into Trados Studio - why are some links automatically shortened?

Hello everyone,

I had the following problem when importing Word documents containing words/titles with hyperlinks into Trados Studio (e.g. for Alignment): if a link contained a hashtag (#), the part of the link that came after the hashtag disappeared. Below is an example:

Original link in Word: https://blahblahblah#something

Link displayed in Trados Studio: https://blahblahblah

Has anyone encountered this issue? Is there a way to change the import settings so that the whole link appears, not just the part before the hashtag?

Any help would be very much appreciated!

  •  

    Can you provide a couple of small documents we can test in alignment to validate this?  Can you also tell us what version of Trados Studio you are using?

    Paul Filkin | RWS Group

    ________________________
    Design your own training!

    You've done the courses and still need to go a little further, or still not clear? 
    Tell us what you need in our Community Solutions Hub

  • Hi Paul,

    I am using Trados Studio 2022.

    Below is a Word document with a link that is automatically split at the # sign.

    The link should look like this: https://gizonline.sharepoint.com/sites/group_1191/SitePages/Personalpolicies.aspx?panel=p-diversity#p-diversity

    But the way it is imported looks like this: 

    https://gizonline.sharepoint.com/sites/group_1191/SitePages/Personalpolicies.aspx?panel=p-diversity

    p-diversity

    Thank you in advance for your help!

  •   

    Thanks for this.  I also get the same results as you.  The target file is correct, so technically there isn't a problem here.

    In a URL, the # symbol is used to denote a fragment identifier.  A fragment identifier specifies an anchor within the resource that the URL identifies, in other words it allows a web browser to jump directly to a specific section on a web page.

    For example, in your URL:

    https://gizonline.sharepoint.com/sites/group_1191/SitePages/Personalpolicies.aspx?panel=p-diversity#p-diversity

    The #p-diversity part means that the browser should automatically scroll to the HTML element on the page whose id attribute is set to p-diversity.  It could be a header, a paragraph, or any other HTML element with this particular id.  The main point is that this id could be translatable, so Studio places it in its own segment and hides the # symbol, since the symbol isn't needed for translation and hiding it makes the link more readable.
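To make the split concrete, here is a minimal Python sketch that separates a URL from its fragment identifier using the standard library. It only illustrates how a URL divides at the # sign; it is not how Studio itself processes the file, and the helper name split_fragment is hypothetical.

```python
from urllib.parse import urldefrag


def split_fragment(url: str) -> tuple[str, str]:
    """Return (url_without_fragment, fragment) for the given URL.

    The fragment is returned without the leading '#'.
    If the URL has no fragment, the second item is an empty string.
    """
    result = urldefrag(url)
    return result.url, result.fragment


if __name__ == "__main__":
    link = (
        "https://gizonline.sharepoint.com/sites/group_1191/SitePages/"
        "Personalpolicies.aspx?panel=p-diversity#p-diversity"
    )
    base, fragment = split_fragment(link)
    print(base)      # the part shown in the first segment
    print(fragment)  # the part shown in the second segment
```

The two values printed correspond to the two pieces seen after import: the address up to the # sign, and the fragment identifier on its own.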

    So I think it's correct to display it like this:

    Screenshot showing the segmentation of the URL

    In your original post you gave the impression that the fragment identifier was missing altogether, but it's not.  It just gets segmented separately.  If you want to avoid this to make the alignment easier, then don't extract the hyperlinks at all, by unchecking this option:

    Screenshot showing the filetype option to exclude hyperlinks - it's unchecked.

    This way, when you open the file, only the link text is extracted and you won't have to deal with the URL at all:

    Screenshot showing the segmentation without the extracted URL

    Paul Filkin | RWS Group
