Metadata for Beginners: What It Is and Why It Matters

If you’re starting your journey in cybersecurity, digital investigations or ethical hacking, you’ll often hear the term metadata. It sounds technical at first, but the concept is simple and important: metadata can reveal hidden details about files, emails, images and documents that many people never notice.

In this beginner-friendly guide, you’ll learn:

  • What metadata is
  • Why it matters
  • Common types of metadata
  • Where metadata is found
  • How cybersecurity professionals use it

What is Metadata?

Metadata is simply data about data. That sounds abstract, so let’s make it easier.

Imagine a photograph. The photo itself is the main data. Metadata is the hidden information attached to that photo, such as:

  • Date it was taken
  • Device used
  • File size
  • Image dimensions
  • GPS location (sometimes)

So:

The content is the data.
The descriptive information is the metadata.

Simple Real-World Example

Think about a book. The book’s text is the actual data. Metadata includes:

  • Title
  • Author
  • Publication date
  • ISBN
  • Number of pages

That information describes the book. The same idea applies to digital files.

Why Metadata Matters

Metadata can reveal a surprising amount of information. It helps people:

  • Organize files
  • Search efficiently
  • Track changes
  • Investigate activity
  • Understand context

In cybersecurity, metadata can provide valuable clues.

Common Types of Metadata

Metadata exists in many forms. Let’s break down common categories.

1. File Metadata:

Most digital files contain descriptive information. For example:

  • File name
  • File size
  • Creation date
  • Modification date
  • File type
  • Author information

This helps systems manage files efficiently.

2. Image Metadata:

Photos often contain hidden technical details. Examples:

  • Camera model
  • Device type
  • Resolution
  • Timestamp
  • GPS coordinates
  • Editing software

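This information is often stored in a format called EXIF (Exchangeable Image File Format), so it is commonly referred to as EXIF metadata.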

3. Document Metadata:

Documents can contain useful embedded details. Examples:

  • Author name
  • Editing timestamps
  • Software version
  • Revision history
  • Company information

Common in:

  • PDFs
  • Word documents
  • Presentations
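
To make the idea of embedded document details concrete, here is a minimal sketch in Python. It relies on the fact that a .docx file is a zip archive that stores author fields in docProps/core.xml. The helper names (read_core_properties, build_sample_archive) are made up for this illustration, and the sample archive is a stand-in that holds only that one XML part, not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(docx_file):
    """Read author-related fields from docProps/core.xml inside a .docx archive."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text_of(tag):
        node = root.find(tag, NAMESPACES)
        return node.text if node is not None else None

    return {
        "creator": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
        "created": text_of("dcterms:created"),
        "modified": text_of("dcterms:modified"),
    }


def build_sample_archive():
    """Hypothetical stand-in: an in-memory zip containing only core.xml."""
    core_xml = (
        "<cp:coreProperties"
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
        ' xmlns:dcterms="http://purl.org/dc/terms/">'
        "<dc:creator>Admin</dc:creator>"
        "<cp:lastModifiedBy>Kalyan</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


# Fields that are absent from the file come back as None.
print(read_core_properties(build_sample_archive()))
```

On a real document you would pass the file path instead of the in-memory sample. Fields that the authoring software did not write are simply missing.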

4. Email Metadata:

Emails contain hidden technical information beyond what you see. For example:

  • Sending server details
  • Message path
  • Timestamps
  • Sender routing information
  • Authentication data

Useful for email investigations.
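
As a rough sketch of what an email investigation looks at, the Python snippet below parses the headers of a made-up message with the standard library’s email module. The message text, addresses and host names are invented for illustration (they use reserved example domains and IP ranges), and summarize_headers is a hypothetical helper name.

```python
from email import message_from_string

# Invented sample message. Each mail server adds its own "Received" line on top,
# so the newest hop appears first in the raw text.
RAW_MESSAGE = """\
Received: from mail.example.org (mail.example.org [192.0.2.10])
\tby mx.example.com with ESMTP id abc123;
\tMon, 01 Jan 2024 10:00:05 +0000
Received: from laptop.local ([198.51.100.7])
\tby mail.example.org with ESMTPSA id xyz789;
\tMon, 01 Jan 2024 10:00:01 +0000
Authentication-Results: mx.example.com; spf=pass smtp.mailfrom=example.org
From: Alice <alice@example.org>
To: Bob <bob@example.com>
Date: Mon, 01 Jan 2024 10:00:00 +0000
Subject: Quarterly report

Hello Bob.
"""


def summarize_headers(raw_text):
    """Pull out the sender, date, authentication result and message path."""
    message = message_from_string(raw_text)
    # Collapse folded header lines into single-line strings.
    hops = [" ".join(value.split()) for value in message.get_all("Received", [])]
    return {
        "from": message["From"],
        "date": message["Date"],
        "authentication": message["Authentication-Results"],
        "hops": list(reversed(hops)),  # oldest hop first, following the message path
    }


summary = summarize_headers(RAW_MESSAGE)
for number, hop in enumerate(summary["hops"], start=1):
    print(f"hop {number}: {hop}")
print("authentication:", summary["authentication"])
```

Reading the hops from oldest to newest shows the route the message took from the sender’s machine to the recipient’s mail server.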

5. Website Metadata:

Websites also contain metadata. Examples:

  • Page descriptions
  • Keywords
  • Open Graph tags
  • Structured data

Used for:

  • Search engines
  • Social sharing
  • Content indexing
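
The sketch below shows one way to collect a page’s meta tags with Python’s built-in html.parser module. The sample HTML and the names MetaTagCollector and extract_meta are assumptions made for this example. A real page would first have to be downloaded, which is left out here.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> tags keyed by their name or property attribute."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Standard tags use name="...", Open Graph tags use property="...".
        key = attributes.get("name") or attributes.get("property")
        if key and "content" in attributes:
            self.tags[key] = attributes["content"]


# Invented sample page used only for this illustration.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Example</title>
    <meta name="description" content="A short page description">
    <meta name="keywords" content="metadata, security">
    <meta property="og:title" content="Example page">
  </head>
  <body><p>Hello</p></body>
</html>
"""


def extract_meta(html_text):
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.tags


for key, value in extract_meta(SAMPLE_PAGE).items():
    print(f"{key}: {value}")
```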

6. System Metadata:

Operating systems track metadata too. Examples:

  • Access times
  • File ownership
  • Permissions
  • System timestamps

Useful for troubleshooting and analysis.
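
The operating system exposes this information to programs as well. Here is a small Python sketch that reads it with os.stat. The function name describe_file and the temporary sample file are made up for this example, and some values (such as the owner ID) are only meaningful on Unix-like systems.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return a small dict of file-system metadata for the given path."""
    info = os.stat(path)

    def as_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),  # for example "-rw-------"
        "owner_uid": info.st_uid,                    # reported as 0 on Windows
        "modified": as_utc(info.st_mtime),
        "accessed": as_utc(info.st_atime),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as handle:
    handle.write(b"hello metadata")
    sample_path = handle.name

for key, value in describe_file(sample_path).items():
    print(f"{key}: {value}")

os.remove(sample_path)
```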

Where Can Metadata Be Found?

Metadata appears in many places. The most common sources are:

Images:

Photos often carry hidden embedded data.

Documents:

Office files frequently store author and revision information.

Emails:

Headers contain metadata.

PDFs:

Creation and editing information may be stored.

Audio / Video Files:

Media files may include:

  • Duration
  • Encoding details
  • Device information

Websites:

HTML metadata helps search engines understand content.

Hello aspiring Ethical Hackers. In our previous blog post, you learnt what footprinting is, why it is important and the different types of footprinting techniques. In this blog post, you will learn about performing footprinting using metadata.

What is Metadata?

Metadata is a set of data that provides information about other data. Simply put, it is data about data. Everyone knows that data is important, but metadata is often ignored even though it is equally important. So how is metadata helpful to ethical hackers? Before answering that, let us see how to extract metadata.

How to extract Metadata?

There are various tools and online resources that extract metadata from different files. For this article, let’s use one tool that is built into Kali Linux: exiftool. Exiftool extracts metadata from a number of file types.

Let’s extract metadata of a docx file.

Now, let’s extract it from a PDF file.

Let’s see another PDF file.

Finally, let’s use it on an image file.
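
If you want to repeat this kind of extraction across many files, exiftool can also be called from a script. The Python sketch below is one possible approach and not the exact commands used in this post. It assumes exiftool is installed and on your PATH, uses its -json option to get machine-readable output, and the helper names plus the list of "interesting" keys are assumptions (the tag names exiftool reports vary by file type).

```python
import json
import shutil
import subprocess

# Assumed tag names that often identify people or software. Adjust per file type.
INTERESTING_KEYS = ("Creator", "Author", "LastModifiedBy", "Producer", "Software")


def build_command(path):
    """Build the exiftool call. -json requests machine-readable output."""
    return ["exiftool", "-json", path]


def extract_metadata(path):
    """Run exiftool on one file and return its metadata as a dict."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool is not installed or not on PATH")
    result = subprocess.run(
        build_command(path), capture_output=True, text=True, check=True
    )
    # exiftool prints a JSON array with one object per input file.
    return json.loads(result.stdout)[0]


def interesting_fields(metadata, keys=INTERESTING_KEYS):
    """Keep only the fields that hint at users or software."""
    return {key: metadata[key] for key in keys if key in metadata}


# Illustration with a hand-written dict, so the example runs without exiftool.
sample = {"FileName": "report.docx", "Creator": "Admin", "LastModifiedBy": "Kalyan"}
print(interesting_fields(sample))
```

On a machine with exiftool available, interesting_fields(extract_metadata("some_file.pdf")) would filter the real output in the same way, where some_file.pdf is a placeholder name.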

How is it useful in pen testing?

As you may have noticed, we extracted metadata from three types of files: a docx, a PDF and an image. That’s because these are the most common types of files available online. Organizations use these types of files on their websites and elsewhere to convey information.

Extracting metadata from the docx file revealed the names of the file’s creators (Admin, Kalyan). These names can help in gaining access later (for example, by suggesting that a username such as admin exists) or in crafting a spear phishing attack aimed at a specific user. We can also see that the document was created using Microsoft Word, so these users could be targeted with a malicious macro attack.

From the information extracted from the first PDF file, we can see that it was created using Microsoft Word. In this case, the version of MS Word (2019) is also clearly visible, along with the creator’s name.

The second PDF file was created using Microsoft PowerPoint, which suggests that these users could be targeted with a PowerPoint-based attack.

Images are another common type of file found on a website or any other company property. We can see that the image I downloaded from a website was either edited or created with Photoshop, and the metadata shows its specific version. So, we can search for vulnerabilities in this particular software or use a lure themed around this software to target the organization.

That’s how metadata can help pen testers gain information about the target organization.
