OSINT: Online Tracking and Behavioral Profiling

Dr. Varin Khera November 2, 2020

This is part 3 of our series of articles on OSINT. Find all articles here.

As the world continues to digitalize, web tracking has grown to an extent that threatens people’s privacy. In today’s digital age, anything you do online can be tracked and recorded. Yes, you read that correctly: anything. And if you think that activating private (incognito) mode in your web browser will prevent others from tracking you online, I’m afraid you are wrong.

Various actors are interested in tracking and recording internet users’ activities, including online advertisers, government agencies, website owners (including social media platforms), search engines and Internet Service Providers (ISPs). Online advertisers are the main group: they record internet users’ browsing activities to build a complete online “profile” for each connected user. These profiles are later used to display tailored advertisements based on each user’s online behavior.

Online tracking poses serious privacy concerns for the general public. For instance, the information collected often includes sensitive data, such as financial and health information. In addition, anything an internet user types into a search engine is recorded and added to his/her online profile. This tracking profile uniquely distinguishes the user whenever he/she goes online. Many people believe that such profiles do not pose a privacy risk because the collected data is anonymous and cannot be linked back to their real-world identity. However, this is not always true: online trackers can often link an internet user’s historical browsing data to his/her real identity using various methods.

While tracking ordinary internet users has become a privacy concern for the general public, online investigators (such as digital forensics investigators and OSINT gatherers), who surf the internet to collect intelligence for a variety of reasons, need to be especially careful to prevent outside observers from tracking their online activities. The first prerequisite of any OSINT gathering task is to secure (conceal) your digital footprint, and to do that correctly, you must know how others can track you online.

Trackers employ various technologies to track internet users’ behavior. New techniques are constantly being developed; many of them are hard to detect and can track an internet user across various devices (e.g. laptop, smartphone, smartwatch and any Internet of Things (IoT) device, such as health devices). In the following section, we will introduce the most common tracking methods already in use.

Online tracking techniques

Here is a brief overview of the most common online tracking techniques.

IP address

Whenever you connect to the internet, your computing device is identified by a unique number called the Internet Protocol address, or IP address.

No two devices can have the same IP address on the same network. There are two versions of IP in use: IPv4 (32 bits long), written in a format like 192.168.1.1, and IPv6 (128 bits long), written in a format like 0:0:0:0:0:ffff:c0a8:101.
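As a small illustration of the two address formats, the following sketch uses Python's standard ipaddress module to inspect the two example addresses from the text. The describe helper is a name introduced here for illustration only.

```python
import ipaddress

def describe(address: str) -> dict:
    """Return the IP version and the address length in bits."""
    ip = ipaddress.ip_address(address)
    return {"version": ip.version, "bits": ip.max_prefixlen}

# The two example addresses used in the article.
v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("0:0:0:0:0:ffff:c0a8:101")

print(describe("192.168.1.1"))              # IPv4, 32 bits
print(describe("0:0:0:0:0:ffff:c0a8:101"))  # IPv6, 128 bits

# The IPv6 example is an IPv4-mapped address: its last 32 bits
# (c0a8:0101) are the IPv4 address 192.168.1.1 written in hexadecimal.
print(v6.ipv4_mapped == v4)
```

Note that both examples are private or mapped addresses used for illustration; the address a tracker sees is the public one assigned to your connection.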

IP addresses come in two types: static and dynamic. A static IP address does not change; it remains the same even after the user reboots his/her computing device or router. Static addresses are commonly used for servers, such as email and file storage servers.

A dynamic IP address is assigned by your network administrator or your ISP for a limited time and may change, for example after the device or router reboots or when the assignment expires. Dynamic addresses are commonly assigned using the Dynamic Host Configuration Protocol (DHCP), a network protocol that automatically assigns an IP address to each device on the network. Most internet users connect to the internet with a dynamic IP address.

To find out whether you are using a static or dynamic IP address, type ipconfig /all at the Windows command prompt and press Enter. Locate your current network connection and look for the DHCP Enabled parameter; if it is set to Yes (see Figure 1), you most likely have a dynamic IP address.

Figure 1 – Checking connection settings under Windows OS
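The same check can be scripted. The sketch below scans text in the style of ipconfig /all output for the DHCP Enabled parameter. The sample text, adapter name and the dhcp_enabled helper are hypothetical, and the pattern assumes an English-language Windows installation (the labels are localized on other systems).

```python
import re

# Hypothetical excerpt in the style of `ipconfig /all` output.
# The adapter name and address are made up for illustration.
SAMPLE_OUTPUT = """
Wireless LAN adapter Wi-Fi:

   Description . . . . . . . . . . . : Example Wireless Adapter
   DHCP Enabled. . . . . . . . . . . : Yes
   IPv4 Address. . . . . . . . . . . : 192.168.1.23(Preferred)
"""

def dhcp_enabled(ipconfig_text: str) -> list:
    """Return one boolean per 'DHCP Enabled' line found in the text."""
    # The label is followed by a run of dots and spaces, then ': Yes' or ': No'.
    matches = re.findall(r"DHCP Enabled[ .]*:\s*(Yes|No)", ipconfig_text)
    return [value == "Yes" for value in matches]

print(dhcp_enabled(SAMPLE_OUTPUT))
```

On a real Windows machine, the text could be obtained with subprocess.run(["ipconfig", "/all"], capture_output=True, text=True).stdout instead of the sample string.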

The IP address is the first thing online trackers use to follow internet users’ browsing activities. However, an IP address cannot be considered a sole unique identifier of an internet user. For example, an IP address can be concealed using a Virtual Private Network (VPN) or an anonymity network such as the Tor network. For this reason, online trackers use other techniques, in addition to the IP address, to recognize online users, as we are going to see next.

Cookies

This is one of the oldest techniques used to track internet users. In its simplest form, a cookie is a small text file stored by the user’s web browser when visiting a website that uses cookies for the first time. Cookies were invented to remember users’ preferences (such as location, language and theme), so website owners can personalize their content when the user returns to the same site. A cookie can store different information according to its type. Simple text cookies contain the domain name the cookie belongs to, in addition to the cookie expiration date (see Figure 2).

When the website you are visiting sets the cookie on your device directly, the cookie is called a first-party cookie. Third-party cookies are set by websites other than the one you are currently on. Third-party cookies are the ones that pose privacy risks for internet users, because they can track users’ browsing history across multiple websites.
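The first-party/third-party distinction can be sketched as a simple domain comparison. This is a simplified illustration: is_third_party and the hostnames are hypothetical, and real browsers compare registrable domains using the Public Suffix List rather than a plain suffix match.

```python
from urllib.parse import urlsplit

def is_third_party(page_url: str, cookie_domain: str) -> bool:
    """Return True when the cookie's domain does not cover the visited page's host.

    Simplification (assumption): a plain suffix match stands in for the
    registrable-domain comparison that real browsers perform.
    """
    page_host = (urlsplit(page_url).hostname or "").lower()
    domain = cookie_domain.lstrip(".").lower()
    same_site = page_host == domain or page_host.endswith("." + domain)
    return not same_site

# Hypothetical sites: the page is on news.example.
print(is_third_party("https://news.example/article", "news.example"))          # False
print(is_third_party("https://news.example/article", ".ads-tracker.example"))  # True
```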

Cookies can be grouped according to their expiration date into:

Session cookies: This type has no expiration date; session cookies are deleted automatically when the user terminates the session or closes the web browser.

Persistent cookies: Usually used to store users’ settings (language, theme, menu preferences) and to facilitate other useful functions such as authentication. A persistent cookie can live for a long time (until it reaches its expiration date). Many third-party advertisers set expiration dates far in the future, so their cookies effectively live indefinitely unless the user deletes them. A well-known example of long-lived tracking storage is the Flash cookie (a Local Shared Object stored by the Adobe Flash plugin rather than by the browser itself). Figure 3 demonstrates how to access all Flash cookies stored on your Windows computer.

Figure 3 – Viewing all Flash cookies stored on a user device under Windows OS (all versions)
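The session/persistent distinction described above comes down to whether the cookie carries a lifetime attribute. A minimal sketch using Python's standard http.cookies module follows; the classify helper and the sample header values are made up for illustration.

```python
from http.cookies import SimpleCookie

def classify(set_cookie_value: str) -> dict:
    """Map each cookie name in a Set-Cookie header value to 'session' or 'persistent'."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    result = {}
    for name, morsel in jar.items():
        # A cookie with neither Expires nor Max-Age lasts only for the browser session.
        has_lifetime = bool(morsel["expires"] or morsel["max-age"])
        result[name] = "persistent" if has_lifetime else "session"
    return result

# Hypothetical Set-Cookie header values.
print(classify("sid=abc123; Path=/; HttpOnly"))
print(classify("lang=en; Max-Age=31536000; Path=/"))
print(classify("ad_id=xyz; Expires=Wed, 01 Jan 2031 00:00:00 GMT; Path=/"))
```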

Social media platforms like Facebook use third-party cookies to track internet users across the internet. Facebook achieves this through its “Like” and “Share” buttons, which are embedded on websites all over the internet. These buttons include a small piece of tracking code (JavaScript), so whenever a user visits a website that hosts them, Facebook automatically records the visit and begins tracking the user, even if he/she does not have a Facebook account.

Online trackers commonly use cookies together with IP addresses to track internet users’ browsing history more accurately.
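To make the combination of cookies and IP addresses concrete, the sketch below shows how a single third-party tracker could group the requests it receives from a widget embedded on several sites. All identifiers, addresses and sites are hypothetical, and real trackers are far more elaborate; this only illustrates the principle.

```python
from collections import defaultdict

# Hypothetical requests seen by one third-party tracker.
# Each tuple: (tracker cookie ID, client IP, page that embedded the tracker's widget).
requests_seen = [
    ("uid-1001", "203.0.113.7", "https://news.example/politics"),
    ("uid-1001", "203.0.113.7", "https://shop.example/running-shoes"),
    ("uid-2002", "198.51.100.4", "https://news.example/sports"),
    ("uid-1001", "203.0.113.9", "https://health.example/diabetes"),
]

def build_profiles(log):
    """Group visited pages and observed IP addresses by tracker cookie ID."""
    profiles = defaultdict(lambda: {"ips": set(), "pages": []})
    for cookie_id, ip, page in log:
        profiles[cookie_id]["ips"].add(ip)
        profiles[cookie_id]["pages"].append(page)
    return dict(profiles)

profiles = build_profiles(requests_seen)
for cookie_id, profile in profiles.items():
    print(cookie_id, sorted(profile["ips"]), profile["pages"])
```

In this made-up log, the cookie keeps the profile of uid-1001 together even when the IP address changes, which is why the two signals are stronger together than either one alone.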

ETag

An entity tag (ETag) is an HTTP mechanism that provides web cache validation, which speeds up browsing on websites that support it. An ETag identifies a specific version of a resource (such as an image, video or audio file). When the browser requests a resource it has already stored locally, it sends the stored ETag to the web server; if it matches the version currently on the website, the resource has not changed and there is no need to download it again (see Figure 4).
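The validation step can be sketched in a few lines. The code below simulates a server that answers 304 Not Modified when the ETag sent by the browser (in the If-None-Match request header) still matches the resource. Deriving the ETag from a content hash is one common approach and an assumption here; make_etag and handle_get are illustrative names.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (an assumption; servers may use other schemes)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body

logo_v1 = b"logo bytes v1"

# First visit: full download, the browser caches the body together with its ETag.
status, headers, payload = handle_get(logo_v1)
cached_etag = headers["ETag"]

# Second visit: the resource is unchanged, so the server replies 304.
status_again, _, payload_again = handle_get(logo_v1, if_none_match=cached_etag)

# The resource changed on the server: the old ETag no longer matches.
status_changed, _, _ = handle_get(b"logo bytes v2", if_none_match=cached_etag)

print(status, status_again, status_changed)  # 200 304 200
```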

Online trackers can exploit ETags by having the tracking server assign each browser a unique ETag value and keep reporting that nothing has changed, regardless of the actual state of the resource. The browser then sends the same value back on every later request, which lets the tracker recognize the user much as a cookie would. This lasts for as long as the cached entry survives, without the user’s knowledge.
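One way an ETag can double as an identifier is sketched below: the server hands every unknown browser a unique ETag and recognizes it when the browser sends it back for revalidation. TrackingEndpoint is a hypothetical class written for illustration, not any real tracker's implementation.

```python
import uuid

class TrackingEndpoint:
    """Toy server that misuses the ETag as a per-browser identifier."""

    def __init__(self):
        self.visits = {}  # ETag value -> number of requests seen with it

    def handle(self, if_none_match=None):
        if if_none_match in self.visits:
            # Known browser: count the visit and tell it the cached copy is still valid.
            self.visits[if_none_match] += 1
            return 304, {"ETag": if_none_match}
        # Unknown browser: issue a fresh, unique ETag that acts as its ID.
        etag = '"' + uuid.uuid4().hex + '"'
        self.visits[etag] = 1
        return 200, {"ETag": etag}

tracker = TrackingEndpoint()

status1, headers1 = tracker.handle()             # first visit, no cached ETag
browser_cache = headers1["ETag"]                 # the browser stores the ETag
status2, headers2 = tracker.handle(if_none_match=browser_cache)  # return visit
status3, headers3 = tracker.handle()             # a different browser or a cleared cache

print(status1, status2, tracker.visits[browser_cache])
print(headers3["ETag"] != browser_cache)
```

Clearing the browser cache removes the stored ETag, so the next request is treated as a new visitor.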

Figure 4 – Viewing Wikipedia website header info using Firefox browser extension HTTP Header Live (https://addons.mozilla.org/en-US/firefox/addon/http-header-live) – Both ETag and IP address tracking is used in this example.

Device Fingerprinting

Also known as “browser fingerprinting”. With this technique, an individual computing device is uniquely identified online using its technical specifications and settings. Fingerprinting works by running code (usually JavaScript, a Java applet or Flash code) inside the user’s browser. Upon execution, this code extracts technical information about the targeted web browser and device settings, such as screen resolution, installed browser add-ons, installed fonts, language settings, time zone, location, browser type and operating system type/build number, among other things. A hash is then generated from the collected information and used to track the user across the internet.

Device fingerprinting allows trackers to track internet users transparently, without using cookies or IP addresses. Some people may think that the technical data collected from end-users’ devices using this method is generic and cannot be used to distinguish one computing device among millions of connected devices. However, this assumption is not accurate. For instance, in a paper released by the EFF in 2010, the researchers found that most web browsers in their experiment could be uniquely identified using this technique. Although the study is somewhat dated, its findings remain valid today, especially given the continual development of fingerprinting technologies over time.
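The hashing step can be sketched as follows. In practice the attributes are collected by script running in the browser; this Python sketch only illustrates how a set of collected attributes might be reduced to a single identifier. The attribute names, values and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a set of device/browser attributes into one hex identifier."""
    # Serialize with sorted keys so the same attributes always give the same hash.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attribute set for one device.
device_a = {
    "screen": "1920x1080",
    "timezone": "UTC+07:00",
    "language": "en-US",
    "fonts": ["Arial", "Calibri", "Tahoma"],
    "browser": "Firefox 82",
    "os": "Windows 10 build 19041",
}

# The same device with one extra font installed.
device_b = dict(device_a, fonts=["Arial", "Calibri", "Tahoma", "Comic Sans MS"])

print(fingerprint(device_a))
print(fingerprint(device_a) == fingerprint(device_b))  # False
```

A single differing attribute yields a different hash, which illustrates how combinations of individually generic settings can single out a device.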

Several online services let you check your current device/browser fingerprint. The following are the most popular ones:

Panopticlick (https://panopticlick.eff.org)
AmIUnique (https://amiunique.org)
Browserleaks (https://browserleaks.com) (see Figure 5)

Figure 5 – Using https://browserleaks.com to check your current browser fingerprint

Conclusion

Online tracking is an important subject to understand for both cybersecurity professionals and end-users. For internet investigators, knowing how to conceal your online traces is very important before beginning your search, and to do that correctly, you should first understand how others can track your activities online.

In this article, we gave a high-level overview of the most common web tracking technologies. In the next article, we will continue the discussion and show you how to prevent outsiders from recording your browsing activities using a range of tools and techniques.

Dr. Varin Khera

Chief Strategy Officer, ITSEC Group / Co-Founder, ITSEC Thailand

Dr. Khera is a veteran cybersecurity executive with more than two decades’ worth of experience working with information security technology, models and processes. He is currently the Chief Strategy Officer of ITSEC Group and the Co-founder and CEO of ITSEC (Thailand). ITSEC is an international information security firm offering a wide range of high-quality information security services and solutions, with operations in Indonesia, Malaysia, Philippines, Singapore, Thailand and Dubai.

Previously the head of cyber security Presales for NOKIA, Dr. Khera has worked with every major telecom provider and government in the APAC region to design and deliver security solutions to a constantly evolving cybersecurity threat landscape.

Dr. Khera holds a Doctor of Information Technology (DIT) from Murdoch University, a Postgraduate Certificate in Network Computing from Monash University and a Certificate of Executive Leadership from Cornell University.

Dr. Khera was one of the first professionals to be awarded the prestigious Asia Pacific Information Security Leadership Awards (ISLA) from ISC2, a world-leading information security certification body, under the category of distinguished IT Security Practitioner for APAC.