February 20 2025

8 mins


Jan Susmaga

What Is a User Agent? The Importance of User Agents in Desktop Browsers, Mobile Devices, and SEO

A user agent is a crucial component of the HTTP communication process. Essentially, it’s the string of text sent by a web browser (or any other client) during an HTTP request to identify the browser type and version number and the operating system. Understanding user agents is vital for technical SEO, because they influence how bots and crawlers see your website—and how content is served to different devices, from desktop machines on Windows NT 10.0 (Win64; x64) to an Android mobile device.

This article explores why user agent data matters for desktop browsers and mobile web browsers, how it fits into an HTTP header, and how SEO professionals use it to configure robots.txt. Whether you’re running a site on a web server or performing an audit, you’ll find plenty of tips and best practices here.

User-Agent Basics: Where It Fits in the HTTP Header

Understanding the Role of Mozilla and Other UA Strings

Every time a web browser makes a request, it includes a request header that contains fields like User-Agent, Accept, and Host. The user-agent header is especially significant because it identifies the application making the request (e.g., Google Chrome, Mozilla Firefox, Safari, or Microsoft Edge).

Here’s an example of a request header for the homepage of Salestube:

 

    GET / HTTP/1.1

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7

    Accept-Encoding: gzip, deflate, br, zstd

    Accept-Language: pl-PL,pl;q=0.9,en-US;q=0.8,en;q=0.7

    [...]

    Host: salestube.tech

    [...]

    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36

    [...]

In our case, this is a request for access to the content located at the URL salestube.tech.

1. GET / HTTP/1.1 informs the server about the method being used—GET (“send me...”)—as well as the version of the HTTP protocol used to establish the connection. Other common methods include POST (“I’m sending you...”), DELETE (“remove from your resources...”), and more.

2. Accept, Accept-Encoding, and Accept-Language provide information about the preferred format in which the client wants to receive the server’s response. This includes file formats, acceptable compression methods, and the language in which the content should be delivered.

3. Host specifies the website’s name according to DNS records, including the domain and, if applicable, a subdomain.

4. And finally, the element that is the focus of today’s discussion—the User-Agent. We’ll cover this in detail below.
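
To make this concrete, here is a minimal Python sketch (standard library only) that sends a similar GET request to salestube.tech with the request headers discussed above; the header values are copied from the example request and are only illustrative.

    import http.client

    # Open a TLS connection to the host named in the Host header.
    conn = http.client.HTTPSConnection("salestube.tech")

    # Issue "GET / HTTP/1.1" with the request headers discussed above.
    conn.request(
        "GET",
        "/",
        headers={
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "pl-PL,pl;q=0.9,en-US;q=0.8,en;q=0.7",
            "User-Agent": (
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
            ),
        },
    )

    response = conn.getresponse()
    print(response.status, response.reason)        # e.g. 200 OK
    print(response.getheader("Content-Type"))      # e.g. text/html; charset=utf-8
    conn.close()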

User Agent - Details

The user-agent header tells the server which application is making the request—or, more precisely, which software and hardware the request claims to be compatible with. It is present in every request made over the HTTP protocol. The user agent string follows a standardized format and includes details about the software, typically also the operating system (e.g., Windows, Mac, Linux), its version, or extensions. The format for creating its value is as follows:

 

User-Agent: <product>/<product-version> <comment>

 

Notice references such as “Windows NT 10.0,” “Win64,” and “x64,” which indicate a desktop environment on a 64-bit machine. The mention of “Chrome” and “Safari” also shows that modern user agents often blend references for compatibility with various rendering engines, including “KHTML” or “like Gecko.” The convention is to list product tokens from the most significant to the least significant.

Which devices can use user agents? Essentially, any device that can make a network request over the HTTP protocol—from a desktop web browser to scripts run in a computer terminal (like cURL). This could even include specialized tools or bots.

Let’s take a look at the ones used by the most popular browsers. It’s also worth noting that User Agents can vary depending on the system we are using.

Google Chrome

  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36

Mozilla Firefox

  • Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0

Microsoft Edge

  • Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.86

Safari

  • Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1
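
To see how the product/version tokens and the parenthesised comments line up in practice, here is a small, purely illustrative Python sketch that splits the Chrome string above into its parts—a rough tokenizer, not a formal parser.

    import re

    UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36")

    # Pull out the parenthesised comments first, then split the remainder
    # into product/version tokens.
    comments = re.findall(r"\((.*?)\)", UA)
    products = re.sub(r"\(.*?\)", "", UA).split()

    print(products)  # ['Mozilla/5.0', 'AppleWebKit/537.36', 'Chrome/131.0.0.0', 'Safari/537.36']
    print(comments)  # ['Windows NT 10.0; Win64; x64', 'KHTML, like Gecko']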

User Agents and SEO: Configuring robots.txt for Bots and Crawlers

A very specific point of interaction between an average SEO professional and user agents is the robots.txt file. We use this simple text file to decide which bots (or crawlers) have access to our website and which are denied it. We can recognize and manage these bots by their user agent string, which is usually public and may mention details like “Mozilla,” “Chrome,” “Safari,” “like Gecko,” or an operating system such as Windows NT 10.0 (Win64; x64), just as a standard desktop web browser would. Fortunately, we don’t have to write those long strings into robots.txt—aliases or short labels are enough.

Below are the most important crawler identifiers that Google uses, each appearing in the User-agent field of an HTTP header when the server receives a request. Even though they don’t look like typical UA entries (with references to KHTML or “like Gecko”), they still function as user agents that let Google’s indexing systems know what content to fetch.

  • User-agent: Googlebot
    The main user agent for general web crawling; it covers both Googlebot Smartphone and Googlebot Desktop.

  • User-agent: Googlebot-Image
    Handles the crawling of images eligible for Google Images, Google Discover, and other image-based elements in the classic SERP.

If we include an allow directive for the "classic" Googlebot, we don’t need to configure this one separately—unless we want to prevent this bot from crawling certain directories of our website.

  • User-agent: Googlebot-Video
    Focuses on search results related to videos. The configuration situation is similar to the one mentioned above.

  • User-agent: Googlebot-News
    Crawls to display content in Google News. Again, the configuration is similar to the previous cases.

In practice, each bot is recognized by its user agent string—but from an SEO perspective, you only have to specify these short names in robots.txt. Whether Googlebot is mimicking a desktop environment or a mobile device, and whether it references “Mozilla/5.0” or “like Gecko,” the server uses that header to decide how to respond. This direct link between user agents and robots.txt is why SEO specialists pay close attention to both.

Which Directives Can We Use to Control Crawlers?

Crawlers—whether they present a full user agent string such as “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36” or a simple bot name—properly recognize only four directives in a robots.txt file:

  • User-agent: Identifies the crawler (or bot) that should follow the stated instructions.

  • Disallow: The first directive. It means: don’t go here.

  • Allow: The second directive, the opposite of the previous one.

  • Sitemap: The URL identifier where the sitemap can be found. There can be more than one.

The essence of robots.txt lies in configuring Allow and Disallow correctly. By default—with no directives at all—the entire site is open to crawling. A Disallow directive closes off selected directories or sections while leaving the rest of the site accessible, and an Allow directive explicitly opens paths, which is mainly useful for re-allowing a subdirectory or file inside an area that has been disallowed. Either directive can also be used to block or open the entire site if desired.

Which of these directives offers more flexibility? How can you combine them to achieve the desired result—particularly if you’re controlling multiple crawlers that each have a unique user agent string (e.g., different references for Safari, like Gecko, or Win64)? Let’s see this in action with real-world examples in the next sections.

What Does a Sample Configuration Look Like?

Directives are written by listing paths to folders. We can also use something like simplified regular expressions to target specific files. But let’s go step by step.

Let’s take a look at the file placed on our own website—and modify it a bit for the purpose of this exercise. We’ll also invent two new bots, UnwantedCrawler and Newsbot, and assume that only these two, along with Google’s bots, are crawling the entire internet.

Default Configuration

Initially, the robots.txt allows all bots to access all content:

User-agent: *
Allow: /

Sitemap: https://salestube.tech/sitemap.xml

Let's go over a few scenarios that we may encounter during configuration.

Blocking part or all of the site for one of the bots.

UnwantedCrawler definitely shouldn't have too much access to our site. Assuming it shouldn't visit our blog articles located in the /blog/ directory at all, we should add the following to the file:

User-agent: unwantedcrawler
Disallow: /blog/

Or maybe we don't want it to access the site at all—in that case, the entire configuration would look like this:

User-agent: *
Allow: /

User-agent: unwantedcrawler
Disallow: /

Sitemap: https://salestube.tech/sitemap.xml
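
Before deploying a configuration like this, it can be checked programmatically. The sketch below uses Python’s standard-library robotparser to evaluate the rules above for two different user agents; the test URL is just an example (note that this parser handles plain Allow/Disallow paths, but not the * and $ wildcards discussed later).

    from urllib import robotparser

    # The same rules as in the configuration above, one directive per line.
    rules = [
        "User-agent: *",
        "Allow: /",
        "",
        "User-agent: unwantedcrawler",
        "Disallow: /",
        "",
        "Sitemap: https://salestube.tech/sitemap.xml",
    ]

    rp = robotparser.RobotFileParser()
    rp.parse(rules)

    # Googlebot falls under the wildcard group and may fetch everything...
    print(rp.can_fetch("Googlebot", "https://salestube.tech/blog/"))        # True
    # ...while unwantedcrawler matches its own group and is blocked entirely.
    print(rp.can_fetch("unwantedcrawler", "https://salestube.tech/blog/"))  # False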

Blocking one part of the site, unblocking another.

A seemingly similar situation, but slightly more complex. We do want Newsbot around, but it doesn’t need to crawl everything—everything it should be interested in is located in the (hypothetical) /news/ directory. Keep in mind that an Allow directive on its own doesn’t block anything else; if we wanted to restrict Newsbot strictly to /news/, we would also add Disallow: / to its group. For now, we simply add the following:

User-agent: *
Allow: /

User-agent: unwantedcrawler
Disallow: /

User-agent: newsbot
Allow: /news/

Sitemap: https://salestube.tech/sitemap.xml

Furthermore, we want it to have access to the directory with images that we use in the news. In that case:

User-agent: *
Allow: /

User-agent: unwantedcrawler
Disallow: /

User-agent: newsbot
Allow: /news/
Allow: /static/photos/

Sitemap: https://salestube.tech/sitemap.xml

...but in the image directory, we have images in two formats—.jpg and .tiff. The latter should not be of interest to our bot at all. So, we can use a very specific formula:

User-agent: *
Allow: /

User-agent: unwantedcrawler
Disallow: /

User-agent: newsbot
Allow: /news/
Disallow: /static/photos/
Allow: /static/photos/*.jpg$

Sitemap: https://salestube.tech/sitemap.xml

How should we understand this combination of Disallow and Allow? As follows: block access to /static/photos/ except for URLs that end with .jpg. The asterisk (*) stands in for any sequence of characters (in our case, file names). The dollar sign ($) indicates how the URL must end. Simply put: as long as the URL in the /static/photos/ folder ends with .jpg, crawl it; ignore the rest.
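
To illustrate that matching logic, here is a hypothetical Python helper that converts a robots.txt path pattern with * and $ into a regular expression and tests it against two example image URLs. It is a simplified sketch of Google-style matching, not a complete implementation (and not every crawler supports these wildcards).

    import re

    def robots_pattern_matches(pattern: str, path: str) -> bool:
        """Roughly emulate Google-style robots.txt matching:
        '*' matches any sequence of characters, '$' anchors the end,
        everything else is a literal prefix match."""
        regex = re.escape(pattern).replace(r"\*", ".*")
        if regex.endswith(r"\$"):
            regex = regex[:-2] + "$"   # URL must end exactly here
        else:
            regex += ".*"              # otherwise it's a prefix match
        return re.match("^" + regex, path) is not None

    print(robots_pattern_matches("/static/photos/*.jpg$", "/static/photos/team.jpg"))   # True
    print(robots_pattern_matches("/static/photos/*.jpg$", "/static/photos/scan.tiff"))  # False
    print(robots_pattern_matches("/static/photos/", "/static/photos/scan.tiff"))        # True (caught by Disallow)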

Finally, let’s add one more thing: none of the crawlers should care about URLs with a specific analytics parameter. To achieve this, you just need to add a line referring to all bots and that specific parameter. For example:

User-agent: *
Allow: /
Disallow: /?srsltid=

It’s important to be precise here. A question mark is often used for pagination or other query strings, so you may not want to accidentally block legitimate pages. But if a bot sees “/something/?srsltid=,” this directive prevents it from crawling that specific parameterized URL—regardless of whether the user agent is a standard desktop crawler referencing Windows NT 10.0 (Win64; x64) or a custom UA for something like Safari, Firefox, or a specialized download manager.

Does Changing the User-Agent Affect How Websites Are Displayed?

To answer briefly—and very much in the spirit of SEO—it depends. An SEO specialist might wonder whether a mobile device sees different web pages than a desktop running Windows NT 10.0 (Win64; x64) or Macintosh (Intel Mac OS X). Could key elements like the Page Title, headings, or noindex attributes change when using a user agent switcher? Technically, yes—if a web server is programmed to serve different content based on the reported user agent. But what does this look like in practice? Let’s dig a bit deeper.

Responsive Design vs. Serving Different Versions

The starting point is responsive design: a single version of the site that automatically adjusts layout, font sizes, and images for different browsers or screen resolutions. Whether someone is browsing from a mobile phone, a desktop, or mobile web browsers like Safari on iOS (e.g., “CPU iPhone OS ... like Mac OS X”), the core navigation experience remains consistent. Targeting the user agent string directly—checking for something like “Mozilla/5.0 (X11; Linux x86_64)” or “Mozilla/5.0 (Windows NT 10.0; Win64; x64) ... Chrome”—is considered a far more rigid solution and is widely discouraged in modern SEO.

Why Separate Versions Can Be Problematic

One alternative is preparing completely distinct site versions: one for desktop browsers (e.g., “Mozilla-compatible” references) and another for mobile devices (strings containing “Android” or “CPU iPhone OS”). You would then serve different layouts based on the UA. While there are no major technical restrictions against this, it’s often seen as bad practice. Google primarily uses mobile-first indexing now, and a user agent only declares compatibility, not actual OS parameters or real screen dimensions. So even if you mimic a desktop using a user agent switcher, the server has no guarantee of accurately detecting the requesting environment.
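
For illustration, a server that branches on the user agent usually relies on nothing more sophisticated than a substring check, as in the hypothetical Python sketch below—which is exactly why the approach is fragile: the header merely declares compatibility, and any client can change it.

    # A naive, hypothetical device check based only on the User-Agent header.
    MOBILE_TOKENS = ("Mobile", "Android", "iPhone", "iPad")

    def looks_mobile(user_agent: str) -> bool:
        return any(token in user_agent for token in MOBILE_TOKENS)

    desktop_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36")
    iphone_ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 13_5_1 like Mac OS X) "
                 "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Mobile/15E148 Safari/604.1")

    print(looks_mobile(desktop_ua))  # False -> serve the "desktop" template
    print(looks_mobile(iphone_ua))   # True  -> serve the "mobile" template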

It (Usually) Won’t Change the Underlying Content

Because modern browsers and mobile web browsers rely on responsive design for consistent rendering, simply changing the user-agent won’t typically alter the availability of crucial elements—like the Page Title, meta tags, or noindex. Unless a site specifically serves different content to, say, “Netscape” or a particular “Chrome/XX.0.0.0” build, the user experience should remain the same. In other words, there’s no inherent SEO benefit to spoofing a user agent for normal browsing.

Thus, the short answer is: changing the user-agent typically doesn’t change what’s available on any given page, nor should it. Web browsers use the user agent for compatibility, but it shouldn’t impact core site content. Instead, we rely on responsive design techniques—such as CSS media queries—to ensure a single, adaptive layout for desktop and mobile alike.

Can I change the user agent myself (also in the context of web scraping)?

We mentioned different versions of a page depending on the User-Agent value. If we wanted to check this parameter, we would need to switch between different user-agents (at least between the desktop and mobile versions). Fortunately, we can easily do this in a browser. Here’s a short guide for computers:

  1. Right-click anywhere on the page in the web browser and choose “Inspect” (or press F12).

  2. The developer tools panel will appear. In the top right corner, click on the gear icon—this opens the settings.

  3. In the settings, select “Devices” and click the button for adding a new device. In the window that appears, you can enter any user agent string you like.

 

Manually browsing each subpage, however, is hard to call an efficient use of time during an SEO audit. Fortunately, crawling tools like Screaming Frog also allow you to change the user-agent. This way, we can easily, conveniently, and in bulk check if there are any significant discrepancies between the mobile and desktop versions. This option can be found in the "User-Agent" tab in the settings. However, it’s worth noting that this feature is only available in the licensed version.

A separate paragraph is worth dedicating to user-agents in the context of web scraping. The Google user-agent can indeed open a few more metaphorical doors for us, but don’t count on it being a universal, magic master key. Some websites use much more sophisticated protections against scraping, such as encrypting sensitive content (phone numbers, emails). Nevertheless, using browser user-agents instead of the default ones is a good practice.
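
As a practical sketch, the snippet below sends a request with a browser-like User-Agent instead of the default one that Python’s urllib announces (Python-urllib/3.x). The target URL and UA string are placeholders—adjust them to your own case and, of course, respect the site’s robots.txt.

    import urllib.request

    # A browser-like User-Agent string (placeholder; any current browser UA will do).
    BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36")

    req = urllib.request.Request(
        "https://salestube.tech/",
        headers={"User-Agent": BROWSER_UA, "Accept-Language": "pl-PL,pl;q=0.9,en;q=0.8"},
    )

    with urllib.request.urlopen(req, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")

    print(len(html), "characters fetched")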

Questions and answers about user-agents.

Finally, let's focus on a few questions that may have arisen during the reading.

Where can I check my current user-agent?

Typing “what’s my user agent” into Google Chrome or Mozilla Firefox will reveal it. Developer Tools also display it under the Network tab.

Can changing the user agent help bypass regional blocks?

No—your IP address dictates your region. The characteristic string known as the user agent does not include geolocation data.

Can the user-agent be used to track my online activity?

In a sense, yes. User-agents can be "captured" for analytical purposes and for resolving potential issues with a website. After all, as mentioned earlier, some websites block scraping bots based on them. However, user-agents have little value for marketers or other customer-journey analysts, because they appear only in HTTP request headers. The only thing that can be learned from them is that we used a device compatible with a particular user agent—and there are usually many such devices, especially since most of us browse the internet on smartphones or computers with mainstream browsers. There’s no question of personalization or tracking an individual user’s journey here.

What about older or specialized devices?

Older phones or niche systems like Firefox OS can have unique UAs referencing “Netscape” or distinct tokens like “Mozilla-compatible.”

What resources can help me validate my robots.txt or user-agent configurations?

A robots.txt validator can check your configuration. For more detail, MDN (Mozilla Developer Network) offers official documentation explaining how user agents function, how a web server can handle multiple device types, and best practices for responsive design.


Do you have any questions?

Do you want to dive deeper into this topic? Write to us!

hello@salestube.tech