Link previews remove fragment identifiers from URLs

See this post, and the mailing list post it originates from:

I tried to link to a specific comment in a GitLab issue, using a URL with a fragment identifier (#note_2772684). But the link preview changes/canonicalizes the link, losing the fragment identifier.

The URL I posted was:

https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2772684

But the link preview changes the URL to:

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40095

The resulting preview links to the top of the issue, not to the comment of 2022-01-26 that I wanted.

My guess is that the forum is taking the og:url metadata property from the Open Graph tags on the page. The linked-to page has:

<meta content="https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40095" property="og:url">

Ideally, the forum would retain the URL fragment identifier even after canonicalizing the URL, in the same way that browsers keep the fragment after following redirects.
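
To illustrate the behavior I have in mind, here is a minimal sketch in Ruby. It is not how Discourse or Onebox actually does it; carry_over_fragment is a hypothetical helper name, and the sketch assumes the canonical URL (for example, the og:url value) is already known. It copies the fragment from the posted URL onto the canonical URL, unless the canonical URL already has a fragment of its own.

```ruby
require 'uri'

# Hypothetical helper (not part of Discourse or Onebox): carry the fragment
# of the URL the user posted over to the canonical URL, unless the canonical
# URL already has its own fragment.
def carry_over_fragment(posted_url, canonical_url)
  posted = URI.parse(posted_url)
  canonical = URI.parse(canonical_url)
  # Only copy when there is a fragment to copy and nothing to overwrite.
  canonical.fragment = posted.fragment if canonical.fragment.nil? && posted.fragment
  canonical.to_s
end

posted    = 'https://bugs.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/40095#note_2772684'
canonical = 'https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40095'
puts carry_over_fragment(posted, canonical)
```

With the two URLs from this report, the result is the gitlab.torproject.org URL with #note_2772684 appended, which is the link I wanted.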

Notice how another URL in the post, https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Properties, does not have Open Graph tags, does not get a link preview, and keeps its fragment identifier.


It looks like Discourse/Oneboxer has a setting somewhere, preserve_fragment_url_hosts:

  def self.preserve_fragment_url_hosts
    @preserve_fragment_url_hosts ||= ['http://github.com']
  end
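
I have not checked how Onebox actually consults this list, so the following is only a sketch of how a host allowlist like this could be applied. PRESERVE_FRAGMENT_HOSTS and preserve_fragment? are hypothetical names, and adding gitlab.torproject.org to the list is an assumption for illustration, not an existing default.

```ruby
require 'uri'

# Hypothetical allowlist, modelled on the default shown above. The second
# entry is an assumption added for this example.
PRESERVE_FRAGMENT_HOSTS = ['http://github.com', 'https://gitlab.torproject.org'].freeze

# Hypothetical check: true when the URL's host matches the host of any
# allowlist entry. The scheme is ignored in this sketch.
def preserve_fragment?(url, allowlist = PRESERVE_FRAGMENT_HOSTS)
  host = URI.parse(url).host
  allowlist.any? { |entry| URI.parse(entry).host == host }
end

puts preserve_fragment?('https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40095#note_2772684')
```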

Possibly related:

https://meta.discourse.org/t/why-instagram-link-onebox-like-this/138047/6

It seems that some Instagram pages have a canonical link with a different URL that requires login. The code in the onebox library prefers the canonical URL.
