A URI is a universal way of describing a resource using either a location (URL) or a name (URN). A URL provides the location of a resource, similar to the way a street address describes a physical location. A URN names a resource that is not necessarily reachable; think of a person's name. URNs can be converted into URLs via a resolver that knows how to translate a name into a location. The most common URL target is a web document; however, mailto:firstname.lastname@example.org is an example of a URL that is not a location.
The standard pattern for any URI is:
The format for a URN is:
urn:[nid]:[nss]
where urn is the scheme, nid is the namespace identifier, and nss is the namespace-specific string, whose syntax is interpreted according to the namespace identifier. URLs are a bit more complex, since multiple schemes can be used, such as http, https, and ftp. The format is:
URLs and URNs can be broken into components:
  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
   |   _____________________|__
  / \ /                        \
  urn:example:animal:ferret:nose
A URL parser can be used to break a URI into its components.
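As an illustration of such a parser, the following minimal sketch uses Python's standard urllib.parse module on the two example URIs from the diagram. The helper name uri_components is hypothetical and is not part of Intelligence.

```python
from urllib.parse import urlsplit


def uri_components(uri: str) -> dict:
    """Split a URI into the five generic components shown in the diagram."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # host plus optional port
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


url_parts = uri_components("foo://example.com:8042/over/there?name=ferret#nose")
urn_parts = uri_components("urn:example:animal:ferret:nose")

print(url_parts)
print(urn_parts)
```

For the URN, the parser reports only a scheme and a path: everything after urn: is treated as the path, with no authority, query, or fragment.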
There are three formats for URNs:
Standard URNs follow the IETF URN standard (RFC 8141). They start with the scheme urn:, followed by a namespace identifier (NID) and then a namespace-specific string (NSS). Standard URNs are designed to ensure global uniqueness; as such, their NIDs are required to be registered with IANA.
Intelligence needs URIs that designate items local to a specific website, such as posts and forms on a site. Intelligence also needs to reference items on other sites that do not have official URNs, such as embedded YouTube videos. To resolve this, a modified URN pattern called an Internal URN is used. Internal URNs refer to platform resources resolved within WordPress. The pattern leaves the standard namespace identifier (nid), the first element of the path, empty and adds a platform-defined internal namespace (inid) as the second element:
urn::[internal namespace]:[internal namespace specific string]
This can be condensed by dropping the initial urn:. The condensed version is the preferred version for analytics tracking:
:[internal namespace]:[internal namespace specific string]
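The three forms (standard, internal, and condensed internal) can be told apart mechanically. The following is a minimal Python sketch under stated assumptions, not Intelligence's actual implementation: the function names parse_urn and condense_urn are hypothetical, the urn: scheme is matched case-sensitively for simplicity, and only the post namespace is taken from this document.

```python
def parse_urn(urn: str) -> tuple:
    """Return (kind, namespace, specific_string) for a URN.

    kind is "standard" for urn:[nid]:[nss] and "internal" for
    urn::[inid]:[inss] or its condensed form :[inid]:[inss].
    """
    has_scheme = urn.startswith("urn:")
    body = urn[len("urn:"):] if has_scheme else urn

    if body.startswith(":"):
        # Empty first element: the standard nid slot was dropped.
        kind, body = "internal", body[1:]
    elif has_scheme:
        kind = "standard"
    else:
        raise ValueError(f"not a URN: {urn!r}")

    namespace, sep, specific = body.partition(":")
    if not (namespace and sep and specific):
        raise ValueError(f"malformed URN: {urn!r}")
    return kind, namespace, specific


def condense_urn(urn: str) -> str:
    """Drop the leading urn: from internal URNs; leave standard URNs intact."""
    kind, namespace, specific = parse_urn(urn)
    prefix = ":" if kind == "internal" else "urn:"
    return f"{prefix}{namespace}:{specific}"


print(parse_urn("urn::post:1"))
print(condense_urn("urn::post:1"))
print(parse_urn("urn:example:animal:ferret:nose"))
```

Only internal URNs are condensed in this sketch, because a standard URN without its urn: scheme would no longer be distinguishable from other strings.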
Standard URN for the book The Last Unicorn identified by ISBN
Internal URN of post with post_id=1 in a WP site:
Internal URN of Gravity form with form_id=1
Internal URN for YouTube video found at https://youtu.be/n_Isqkuija8
Internal URN for Google Analytics Goal index 1:
URLs should be shortened to their most condensed form to be stored as a URI. If a path is for an internal page, the URL should be represented as a relative path without scheme and host. For a website residing on http://example.com the link:
should be shortened to:
Absolute URLs should have their schemes removed, so that they start with //. Intelligence supports the concept of a system path, which is a relative URL that uniquely describes a system entity. In general, system paths are an alternative way to describe an internal URN. For example, the URN for the post with post_id=1 is :post:1 and the system path is post/1.
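The shortening rules for URLs can be sketched in Python as follows. This is an illustrative sketch rather than Intelligence's own code: the function name shorten_url and the host other.example.org are hypothetical, and the sketch assumes that a URL is internal when its authority (host plus any port) matches the site host exactly, ignoring case.

```python
from urllib.parse import urlsplit


def shorten_url(url: str, site_host: str) -> str:
    """Condense a URL for storage.

    Internal URLs lose scheme and host; external URLs lose only the scheme.
    """
    parts = urlsplit(url)
    if not parts.netloc:
        # No authority: already a relative URL, keep it unchanged.
        return url

    internal = parts.netloc.lower() == site_host.lower()
    path = parts.path
    if internal and not path:
        path = "/"  # the bare site root

    tail = path
    if parts.query:
        tail += "?" + parts.query
    if parts.fragment:
        tail += "#" + parts.fragment

    return tail if internal else "//" + parts.netloc + tail


print(shorten_url("http://example.com/blog", "example.com"))
print(shorten_url("https://other.example.org/blog?page=2", "example.com"))
```

Because the query and fragment are kept in the shortened string, the result carries everything after the authority, not just the path proper.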
Absolute URL to a website:
Absolute URL to the blog on a website:
Relative URL to the blog on the current site
System path to the blog list on the current site
System path to the post with post_id=123
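Since system paths are an alternative spelling of internal URNs, converting between the two can be sketched as below. The only mapping confirmed by this document is :post:1 to post/1; the sketch assumes that other namespaces follow the same pattern, that only the first separator is swapped, and that a system path without an identifier segment has no URN equivalent. The function names are hypothetical.

```python
def urn_to_system_path(urn: str) -> str:
    """Convert an internal URN (:post:1 or urn::post:1) to a system path."""
    if urn.startswith("urn::"):
        urn = urn[len("urn:"):]  # reduce to the condensed form
    if not urn.startswith(":"):
        raise ValueError(f"not an internal URN: {urn!r}")
    namespace, sep, specific = urn[1:].partition(":")
    if not (namespace and sep and specific):
        raise ValueError(f"malformed internal URN: {urn!r}")
    return f"{namespace}/{specific}"


def system_path_to_urn(path: str) -> str:
    """Convert a system path (post/1) to a condensed internal URN."""
    if not path or path[0] in ":/":
        raise ValueError(f"not a system path: {path!r}")
    namespace, sep, specific = path.partition("/")
    if not (sep and specific):
        # Assumption: paths without an identifier segment are not mapped.
        raise ValueError(f"no identifier segment in: {path!r}")
    return f":{namespace}:{specific}"


print(urn_to_system_path(":post:1"))
print(system_path_to_urn("post/123"))
```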
Determining URL and URN by path alone
Google Analytics (GA) has historically been set up to store URL locations in two dimensions: host and path. GA does not store the scheme, so there is no direct way to know whether a path belongs to a URL or a URN. GA also has no dimensions for query or fragment, so query and fragment data are stored in the path; i.e., the path stores everything after the authority/host. Many of Intelligence's internal storage structures follow GA's host and path structure, so a convention is needed for formatting the path in a way that shows whether it is a URL or a URN. The convention is that anything that starts with a colon (:) or urn: is a URN path; anything else is a URL path. If a URL path starts with a double slash (//), it is an absolute path. If it starts with a single slash (/), it is a relative path. If it starts with neither a colon (:) nor a slash (/), it is a system path.
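The convention above amounts to a short chain of prefix checks. A minimal Python sketch follows; the function name classify_path and the returned labels are hypothetical.

```python
def classify_path(path: str) -> str:
    """Classify a stored path using the prefix convention described above."""
    if path.startswith(":") or path.startswith("urn:"):
        return "urn"
    if path.startswith("//"):
        return "absolute"  # absolute URL with its scheme removed
    if path.startswith("/"):
        return "relative"  # URL relative to the current site
    return "system"        # system path such as post/1


for example in (":post:1", "urn:example:animal:ferret:nose",
                "//example.com/blog", "/blog", "post/1"):
    print(example, "->", classify_path(example))
```

The order of the checks matters: the double-slash test has to run before the single-slash test, otherwise every absolute path would be reported as relative.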