Metadata Tools of the Trade

By Dr. Neal Krawetz

Evaluating metadata requires more than just loading a picture and pushing a button. While there are lots of tools that display metadata, they do not all display the same things, and a few things make evaluating metadata complicated. Problems occur when analysts don’t take the time to identify what created the different pieces of metadata, what the metadata components describe, and what the programs that extract metadata are really showing.

The Most Common Mistake

Different types of metadata come from different sources. A single picture may contain EXIF metadata, JFIF metadata, XMP, MPF, and other types of metadata. Some of these metadata structures come from cameras, while others come from applications. Just identifying what created a particular metadata block may be difficult.
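As a rough illustration of how these blocks sit side by side in one file, the following Python sketch walks the segments at the start of a JPEG and reports which metadata containers it recognizes. The identifier strings are the ones commonly used for each container. The function name and the signature table are hypothetical names chosen for illustration, and the scan is deliberately simplified: it stops at the first scan or end marker and does not look for metadata stored anywhere else.

```python
import struct

# (marker, identifier prefix, label) for metadata containers commonly found
# in JPEG APPn segments. This table is illustrative, not exhaustive.
SIGNATURES = [
    (0xE0, b"JFIF\x00", "JFIF"),
    (0xE1, b"Exif\x00\x00", "EXIF"),
    (0xE1, b"http://ns.adobe.com/xap/1.0/\x00", "XMP"),
    (0xE2, b"ICC_PROFILE\x00", "ICC Profile"),
    (0xE2, b"MPF\x00", "MPF"),
    (0xED, b"Photoshop 3.0\x00", "Photoshop/IPTC"),
]


def list_metadata_blocks(data: bytes) -> list[str]:
    """Return labels for the recognized metadata segments, in file order."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the segment structure; give up
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        for seg_marker, prefix, label in SIGNATURES:
            if marker == seg_marker and payload.startswith(prefix):
                found.append(label)
        pos += 2 + length
    return found
```

Knowing that a block is present is only the first step. The list says nothing about which camera or application wrote each block.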

The various fields within the metadata increase the complexity. Within each type of metadata are sets of data fields, and each field has a name. However, determining the purpose of a field may be non-intuitive. Often, the field has a well-defined name that is documented in a formal standard, but the field’s name can still be confusing. For example, what does a field named “Date/Time” represent? Depending on the metadata block, it could be when the file was created, when it was last modified, when the metadata was added, or when the event depicted in the photo occurred.

Over the last month, there has been one big mistake that I’ve repeatedly seen. And it concerns one specific type of metadata: the ICC Profile.

The ICC Profile metadata block defines how to adjust the picture’s coloring. When an ICC Profile is applied, the “red” seen on one monitor will look the same as the red on another monitor and as close as possible to the same color when the picture is printed.

Virtually no digital cameras natively embed an ICC Profile metadata block. When you see a picture with an ICC Profile included as metadata, you should immediately assume that it was added by an application. (I say “virtually no digital cameras” because there are one or two high-end cameras that natively attach ICC Profiles when the picture is captured, and even those are disabled by default. An ICC Profile is virtually never added by a camera.)

Because an ICC Profile is generated independently of the picture it is attached to, the timestamp in the ICC Profile identifies when the profile was created — and not when it was attached to the picture or when the picture was photographed. Unfortunately, it is common for users to view the metadata, see the ICC Profile timestamp, and assume that the picture is fake because the picture doesn’t depict an event that happened when the ICC Profile was created.

To be clear: the only information that can be deduced from the ICC Profile’s timestamp is (1) the ICC Profile was generated at that time (assuming the timestamp is accurate), (2) it was added by an application (assuming it isn’t one of the extremely rare cameras that natively embeds a color profile), and (3) it was added to the picture some time after it was generated.
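For reference, the timestamp in question lives in the fixed 128-byte header at the start of the profile: six big-endian 16-bit numbers at byte offset 24, followed by the "acsp" signature at offset 36. Here is a minimal sketch, assuming the raw profile bytes have already been extracted from the picture. The function name is hypothetical.

```python
import struct
from datetime import datetime


def icc_profile_created(profile: bytes) -> datetime:
    """Read the creation date/time stored in an ICC profile header."""
    if len(profile) < 128 or profile[36:40] != b"acsp":
        raise ValueError("not an ICC profile header")
    # Bytes 24..35 hold year, month, day, hour, minute, second (uint16 each).
    year, month, day, hour, minute, second = struct.unpack(">6H", profile[24:36])
    return datetime(year, month, day, hour, minute, second)
```

Whatever value this returns describes the profile, not the picture. At most it is a lower bound on when the profile could have been attached.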

Some photo sharing services, like Facebook, automatically attach an ICC Profile to pictures. With Facebook, the ICC Profile says “Copyright: FB” (FB for Facebook) and the ICC Profile’s timestamp denotes when Facebook created the profile (many years ago). This timestamp doesn’t tell you when the picture was created. It doesn’t tell you when the picture was uploaded to Facebook, and it doesn’t identify when any modifications were made. At best, it tells you that the picture was uploaded to Facebook on or after that date (and likely long after that date). However, if the picture is supposed to be direct from a camera and it contains this Facebook indicator, then you can question why the picture isn’t from a camera and why it went through Facebook.

Similarly, if you adjust the display colors on a Mac, then the Mac generates a new ICC Profile. This new profile contains a timestamp that identifies when the monitor was calibrated. Some Mac applications attach this monitor-specific profile to pictures. When this happens, you can end up with a timestamp that doesn’t match the picture. If the ICC Profile’s timestamp is newer than the picture, then it means someone calibrated their Mac monitor after the picture was photographed but before it was post-processed on the Mac. (This is actually pretty common.) It doesn’t denote an inconsistency in the timestamps.

Common Tools, Common Problems

There are dozens of common types of metadata, and they are all in different formats. Not every metadata viewer supports every metadata format. As a result, some metadata viewers omit metadata from the analysis. Adding to the complexity, some metadata viewers rename standard fields — leading to ambiguity or confusion. And a few metadata tools alter the metadata before displaying it.

Not every metadata viewer is the same; some programs are better than others at displaying metadata. And some programs are so poor that you’re better off not using them at all. To understand each tool’s limitations, let’s evaluate a simple picture of a brick wall with some common metadata viewers:


Viewer: ExifTool

At the top of the “great tool” list is Phil Harvey’s ExifTool. This tool decodes more types of metadata than anything else out there. (And Phil’s constantly adding to it.) By default, the output is a list of metadata fields and values.

ExifTool has tons of command-line options for organizing the data, displaying more data, and extracting parts of the metadata. This program is fairly technical and the command-line parameters can become pretty complicated; this is not a tool for non-techies. However, it is very complete.
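As one hedged example of those options, the sketch below calls ExifTool from Python with `-j` (JSON output), `-a` (allow duplicate tags), and `-G0` (prefix each tag with its family 0 group, such as EXIF or XMP), then regroups the result by metadata type. It assumes `exiftool` is installed and on the PATH. The two function names are hypothetical.

```python
import json
import subprocess
from collections import defaultdict


def exiftool_json(path: str) -> dict:
    """Run ExifTool on one file and return its tags as a flat dict."""
    result = subprocess.run(
        ["exiftool", "-j", "-a", "-G0", path],
        capture_output=True, text=True, check=True,
    )
    # ExifTool emits a JSON array with one object per input file.
    return json.loads(result.stdout)[0]


def group_by_type(tags: dict) -> dict:
    """Turn {"EXIF:Model": ...} style keys into {"EXIF": {"Model": ...}}."""
    groups = defaultdict(dict)
    for key, value in tags.items():
        group, _, name = key.rpartition(":")
        # Keys without a group prefix (such as SourceFile) go under "(none)".
        groups[group or "(none)"][name] = value
    return dict(groups)
```

A call such as `group_by_type(exiftool_json("redbrick.jpg"))` would then give one dictionary per metadata type, which makes it easier to see which block a given field came from.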

The following ExifTool output shows the brick wall picture’s metadata grouped by type of metadata. At the beginning, ExifTool lists its own version number and any warnings. Then comes information about the file on the file system. After that come the different types of metadata and the various fields and values. At the very end of the listing, ExifTool generates a summary of findings (in the “Composite” section).

---- ExifTool ----
ExifTool Version Number : 10.02
Warning : [minor] Bad format (1024) for MakerNotes entry 0
---- File ----
File Name : redbrick.jpg
Directory : files/
File Size : 4.3 MB
File Modification Date/Time : 2014:12:15 09:01:29-07:00
File Access Date/Time : 2015:01:13 07:09:10-07:00
File Inode Change Date/Time : 2015:01:17 18:52:37-07:00
File Permissions : rw-r--r--
File Type : JPEG
File Type Extension : jpg
MIME Type : image/jpeg
Exif Byte Order : Big-endian (Motorola, MM)
Current IPTC Digest : dd4e6fa7794eab3a6cb11eb9d861a220
Image Width : 4008
Image Height : 2968
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
---- JFIF ----
JFIF Version : 1.01
Resolution Unit : inches
X Resolution : 72
Y Resolution : 72
---- EXIF ----
Image Description : Red Brick, Black Wall (Washington, DC)
Camera Model Name : E-620
Orientation : Horizontal (normal)
Software : GIMP 2.6.8
Modify Date : 2010:11:19 09:12:46
Artist : takomabibelot
Copyright : Creative Commons Attribution.
Exposure Time : 1/25
F Number : 6.3
Exposure Program : Program AE
ISO : 200
Exif Version : 0221
Date/Time Original : 2010:11:19 09:12:46
Create Date : 2010:11:19 09:12:46
Exposure Compensation : -0.3
Max Aperture Value : 3.5
Metering Mode : Spot
Light Source : Unknown
Flash : Auto, Did not fire
Focal Length : 180.0 mm
User Comment :
Flashpix Version : 0100
Color Space : sRGB
File Source : Digital Camera
CFA Pattern : [Blue,Green][Green,Red]
Custom Rendered : Normal
Exposure Mode : Manual
White Balance : Auto
Digital Zoom Ratio : 1
Scene Capture Type : Standard
Gain Control : Low gain up
Contrast : Normal
Saturation : Normal
Sharpness : Normal
GPS Version ID :
GPS Latitude Ref : North
GPS Latitude : 38 deg 57' 19.00"
GPS Longitude Ref : West
GPS Longitude : 77 deg 0' 36.34"
GPS Altitude Ref : Above Sea Level
GPS Altitude : 80.00085616 m
GPS Time Stamp : 14:12:46
GPS Satellites : 0
GPS Map Datum : WGS-84
GPS Date Stamp : 2010:11:19
---- PrintIM ----
PrintIM Version : 0300
---- IPTC ----
Keywords : geo:lat=38.95527906, geo:lon=-77.01009393, geotagged
By-line : takomabibelot
City : Washington
Sub-location : Chillum Station
Province-State : District of Columbia
Country-Primary Location Code : USA
Country-Primary Location Name : United States
Credit : takomabibelot
Copyright Notice : Creative Commons Attribution.
Caption-Abstract : Red Brick, Black Wall (Washington, DC)
Application Record Version : 4
Date Created : 2010:11:19
Time Created : 09:12:46-05:00
---- XMP ----
XMP Toolkit : Image::ExifTool 8.42
Country Code : USA
Creator City : Takoma Park
Creator Country : USA
Creator Postal Code : 20910-5107
Creator Region : MD
Creator Work Email :
Creator Work URL :
Location : Chillum Station
Creator : takomabibelot
Description : Red Brick, Black Wall (Washington, DC)
Rights : Creative Commons Attribution.
Subject : geo:lat=38.95527906, geo:lon=-77.01009393, geotagged
Color Space : sRGB
Contrast : Normal
Custom Rendered : Normal
Date/Time Digitized : 2010:11:19 09:12:46-05:00
Date/Time Original : 2010:11:19 09:12:46-05:00
Digital Zoom Ratio : 1
Exif Version : 0221
Exposure Compensation : -0.3
Exposure Mode : Manual
Exposure Program : Program AE
Exposure Time : 1/25
F Number : 6.3
File Source : Digital Camera
Flashpix Version : 0100
Focal Length : 180.0 mm
GPS Altitude : 80.0008561643836 m
GPS Altitude Ref : Above Sea Level
GPS Latitude : 38 deg 57' 19.00" N
GPS Longitude : 77 deg 0' 36.34" W
GPS Map Datum : WGS-84
GPS Date/Time : 2010:11:19 14:12:46Z
GPS Version ID :
Gain Control : Low gain up
ISO : 200
Light Source : Unknown
Max Aperture Value : 3.5
Metering Mode : Spot
Saturation : Normal
Scene Capture Type : Standard
Sharpness : Normal
User Comment :
White Balance : Auto
City : Washington
Country : United States
Credit : takomabibelot
Date Created : 2010:11:19 09:12:46-05:00
State : District of Columbia
Camera Model Name : E-620
Orientation : Horizontal (normal)
Software : GIMP 2.6.8
Create Date : 2010:11:19 09:12:46
Modify Date : 2010:11:19 09:12:46-05:00
---- Composite ----
Aperture : 6.3
Date/Time Created : 2010:11:19 09:12:46-05:00
GPS Altitude : 80 m Above Sea Level
GPS Date/Time : 2010:11:19 14:12:46Z
GPS Latitude : 38 deg 57' 19.00" N
GPS Latitude Ref : North
GPS Longitude : 77 deg 0' 36.34" W
GPS Longitude Ref : West
GPS Position : 38 deg 57' 19.00" N, 77 deg 0' 36.34" W
Image Size : 4008x2968
Megapixels : 11.9
Shutter Speed : 1/25
Focal Length : 180.0 mm
Light Value : 9.0
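The listing above reports the same moment in several ways: the EXIF “Date/Time Original” has no time zone, the XMP copy carries a -05:00 offset, and the GPS date/time is in UTC. As a small sketch of how to check that these agree, the following Python parses the three strings exactly as ExifTool printed them. The helper name is hypothetical, and the `%z` handling of `-05:00` and `Z` assumes Python 3.7 or later.

```python
from datetime import datetime, timezone


def parse_stamp(text: str) -> datetime:
    """Parse an ExifTool-style timestamp; naive if it has no UTC offset."""
    for fmt in ("%Y:%m:%d %H:%M:%S%z", "%Y:%m:%d %H:%M:%S"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp: {text!r}")


exif_original = parse_stamp("2010:11:19 09:12:46")       # EXIF Date/Time Original
xmp_original = parse_stamp("2010:11:19 09:12:46-05:00")  # XMP Date/Time Original
gps_utc = parse_stamp("2010:11:19 14:12:46Z")            # Composite GPS Date/Time

# Timestamps that carry an offset compare by instant, so this checks that
# 09:12:46 at UTC-5 and 14:12:46 UTC describe the same moment.
same_instant = xmp_original == gps_utc
xmp_in_utc = xmp_original.astimezone(timezone.utc)
```

The EXIF value cannot be compared this way on its own because it has no offset. It only lines up with the others if you assume it was recorded in the same local time zone.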

One of the really nice things about ExifTool is that Phil tries to adhere to standards. If the standard says that a field is named “GPS Latitude”, then he calls the field name “GPS Latitude”. This makes it really easy to compare the program’s output with the technical standards. If there’s any ambiguity in a field’s name or purpose, then it is because of the standard and not because of ExifTool.

Not every type of metadata is available in a public standard, and some proprietary metadata structures have been reverse-engineered to identify the purpose. In these cases, ExifTool uses a name that is consistent with the observed purpose.

ExifTool is an extremely common back-end tool for analysis systems. Typically, a simplified user interface calls ExifTool and nicely formats the results. If you have viewed metadata with FotoForensics, NetClean/Griffeye, Flickr, or Jeffrey’s Exif Viewer, then you were actually using ExifTool.

Viewer: Exiv2

Another good command-line metadata parser is Exiv2. While it only handles a fraction of the metadata types supported by ExifTool, Exiv2 does an extremely thorough job decoding EXIF, IPTC, and XMP metadata blocks. By default, this program only displays a few metadata fields. However, with a couple of command-line parameters, it can display much more information. Here’s the brick wall’s metadata as reported by Exiv2:

Error: Directory Olympus2 with 42499 entries considered invalid; not read.
Exif.Image.ImageDescription Red Brick, Black Wall (Washington, DC)
Exif.Image.Model E-620
Exif.Image.Orientation top, left
Exif.Image.Software GIMP 2.6.8
Exif.Image.DateTime 2010:11:19 09:12:46
Exif.Image.Artist takomabibelot
Exif.Image.Copyright Creative Commons Attribution.
Exif.Image.ExifTag 832
Exif.Photo.ExposureTime 1/25 s
Exif.Photo.FNumber F6.3
Exif.Photo.ExposureProgram Auto
Exif.Photo.ISOSpeedRatings 200
Exif.Photo.ExifVersion 2.21
Exif.Photo.DateTimeOriginal 2010:11:19 09:12:46
Exif.Photo.DateTimeDigitized 2010:11:19 09:12:46
Exif.Photo.ExposureBiasValue -3/10 EV
Exif.Photo.MaxApertureValue F3.5
Exif.Photo.MeteringMode Spot
Exif.Photo.LightSource Unknown
Exif.Photo.Flash No, auto
Exif.Photo.FocalLength 180.0 mm
Exif.Photo.MakerNote (Binary value suppressed)
Exif.MakerNote.Offset 1254
Exif.MakerNote.ByteOrder MM
Exif.Photo.FlashpixVersion 1.00
Exif.Photo.ColorSpace sRGB
Exif.Photo.FileSource Digital still camera
Exif.Photo.CFAPattern 2 0 2 0 2 1 1 0
Exif.Photo.CustomRendered Normal process
Exif.Photo.ExposureMode Manual
Exif.Photo.WhiteBalance Auto
Exif.Photo.DigitalZoomRatio 1.0
Exif.Photo.SceneCaptureType Standard
Exif.Photo.GainControl Low gain up
Exif.Photo.Contrast Normal
Exif.Photo.Saturation Normal
Exif.Photo.Sharpness Normal
Exif.Image.GPSTag 24140
Exif.GPSInfo.GPSLatitudeRef North
Exif.GPSInfo.GPSLatitude 38deg 57' 19.005"
Exif.GPSInfo.GPSLongitudeRef West
Exif.GPSInfo.GPSLongitude 77deg 0' 36.338"
Exif.GPSInfo.GPSAltitudeRef Above sea level
Exif.GPSInfo.GPSAltitude 80.0 m
Exif.GPSInfo.GPSTimeStamp 14:12:46
Exif.GPSInfo.GPSSatellites 0
Exif.GPSInfo.GPSMapDatum WGS-84
Exif.GPSInfo.GPSDateStamp 2010:11:19
Exif.Image.PrintImageMatching (Binary value suppressed)
Iptc.Application2.Keywords geo:lat=38.95527906
Iptc.Application2.Keywords geo:lon=-77.01009393
Iptc.Application2.Keywords geotagged
Iptc.Application2.Byline takomabibelot
Iptc.Application2.City Washington
Iptc.Application2.SubLocation Chillum Station
Iptc.Application2.ProvinceState District of Columbia
Iptc.Application2.CountryCode USA
Iptc.Application2.CountryName United States
Iptc.Application2.Credit takomabibelot
Iptc.Application2.Copyright Creative Commons Attribution.
Iptc.Application2.Caption Red Brick, Black Wall (Washington, DC)
Iptc.Application2.RecordVersion 4
Iptc.Application2.DateCreated 2010-11-19
Iptc.Application2.TimeCreated 09:12:46-05:00
Xmp.iptc.CountryCode USA
Xmp.iptc.CreatorContactInfo type="Struct"
Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCity Takoma Park
Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrCtry USA
Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrPcode 20910-5107
Xmp.iptc.CreatorContactInfo/Iptc4xmpCore:CiAdrRegion MD
Xmp.iptc.Location Chillum Station
Xmp.dc.creator takomabibelot
Xmp.dc.description lang="x-default" Red Brick, Black Wall (Washington, DC)
Xmp.dc.rights lang="x-default" Creative Commons Attribution.
Xmp.dc.subject geo:lat=38.95527906, geo:lon=-77.01009393, geotagged
Xmp.exif.ColorSpace sRGB
Xmp.exif.Contrast Normal
Xmp.exif.CustomRendered Normal process
Xmp.exif.DateTimeDigitized 2010-11-19T09:12:46-05:00
Xmp.exif.DateTimeOriginal 2010-11-19T09:12:46-05:00
Xmp.exif.DigitalZoomRatio 1/1
Xmp.exif.ExifVersion 2.21
Xmp.exif.ExposureBiasValue -3/10 EV
Xmp.exif.ExposureMode Manual
Xmp.exif.ExposureProgram Auto
Xmp.exif.ExposureTime 1/25
Xmp.exif.FNumber F6.3
Xmp.exif.FileSource Digital still camera
Xmp.exif.FlashpixVersion 1.00
Xmp.exif.FocalLength 180.0 mm
Xmp.exif.GPSAltitude 93441/1168
Xmp.exif.GPSAltitudeRef Above sea level
Xmp.exif.GPSLatitude 38,57.316744N
Xmp.exif.GPSLongitude 77,0.605636W
Xmp.exif.GPSMapDatum WGS-84
Xmp.exif.GPSTimeStamp 2010:11:19 14:12:46
Xmp.exif.GainControl Low gain up
Xmp.exif.ISOSpeedRatings 200
Xmp.exif.LightSource Unknown
Xmp.exif.MaxApertureValue 5357/1482
Xmp.exif.MeteringMode Spot
Xmp.exif.Saturation Normal
Xmp.exif.SceneCaptureType Standard
Xmp.exif.Sharpness Normal
Xmp.exif.UserComment lang="x-default"
Xmp.exif.WhiteBalance Auto
Xmp.photoshop.City Washington
Xmp.photoshop.Country United States
Xmp.photoshop.Credit takomabibelot
Xmp.photoshop.DateCreated 2010-11-19T09:12:46-05:00
Xmp.photoshop.State District of Columbia
Xmp.tiff.ImageDescription lang="x-default" OLYMPUS DIGITAL CAMERA
Xmp.tiff.Model E-620
Xmp.tiff.Orientation top, left
Xmp.tiff.Software GIMP 2.6.8
Xmp.xmp.CreateDate 2010-11-19T09:12:46
Xmp.xmp.ModifyDate 2010-11-19T09:12:46-05:00
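The two listings print several of the same values in different notations, which can look like a mismatch at first glance. The following worked example, using only numbers copied from the listings above, converts them to a common form. The helper name is hypothetical, and reading MaxApertureValue as an APEX value (f-number = √2 raised to that value) follows the EXIF convention for that field.

```python
import math
from fractions import Fraction


def to_degrees(deg, minutes, seconds=0.0, ref="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = deg + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Latitude: EXIF (deg, min, sec), XMP (deg, decimal minutes), IPTC keyword.
lat_exif = to_degrees(38, 57, 19.005, "N")      # 38deg 57' 19.005"
lat_xmp = to_degrees(38, 57.316744, ref="N")    # 38,57.316744N
lat_keyword = 38.95527906                       # geo:lat=38.95527906

# Longitude in the same three notations.
lon_exif = to_degrees(77, 0, 36.338, "W")       # 77deg 0' 36.338"
lon_xmp = to_degrees(77, 0.605636, ref="W")     # 77,0.605636W
lon_keyword = -77.01009393                      # geo:lon=-77.01009393

# Rationals that Exiv2 shows unreduced in the XMP block.
altitude_m = float(Fraction(93441, 1168))       # Xmp.exif.GPSAltitude
max_aperture = math.sqrt(2) ** float(Fraction(5357, 1482))  # APEX to f-number
```

All three latitude and longitude notations agree to within about a millionth of a degree, the altitude works out to the 80.00085616 m that ExifTool shows, and the aperture works out to roughly F3.5.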

Most web-based sites use either ExifTool or Exiv2 for metadata analysis. (If the site doesn’t say what they use, then look at the types of metadata. If there’s only EXIF, IPTC, XMP, and a few supported MakerNotes, then it’s Exiv2. Otherwise, it’s ExifTool. In general, ExifTool is more common than Exiv2.)
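That rule of thumb can be written down as a tiny Python sketch. The set of names and the function are hypothetical, and the result is only a guess in the spirit of the heuristic above, not a reliable fingerprint.

```python
# Metadata types that the rule of thumb above associates with Exiv2.
EXIV2_TYPES = {"EXIF", "IPTC", "XMP", "MakerNotes"}


def guess_backend(types_shown: set[str]) -> str:
    """Guess the back-end from the set of metadata types a site displays."""
    extra = types_shown - EXIV2_TYPES
    if extra:
        # Anything beyond the Exiv2 set points toward ExifTool.
        return "probably ExifTool"
    return "possibly Exiv2"
```

For example, the types in the ExifTool listing earlier (which include JFIF and PrintIM) give "probably ExifTool", while a site that shows only EXIF, IPTC, and XMP gives "possibly Exiv2".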

Viewer: Adobe’s File Info

Some graphical applications include metadata viewers that are not based on ExifTool or Exiv2. This is where we start getting into problems. Most of these alternate metadata viewers do not display most types of metadata. And worse: they may provide misleading information.

For example, Adobe Photoshop includes a built-in metadata viewer under the “File Info” menu option. This viewer supports XMP, EXIF, IPTC, and a few other formats. It is neither as powerful nor as complete as ExifTool or Exiv2. In some cases, it will omit, reformat, or alter information prior to displaying it. Here are two screenshots showing the brick wall picture’s metadata from Adobe Photoshop CS5:

The tabs at the top divide up the types of metadata. However, unknown metadata blocks and unknown metadata fields are not shown. If there are duplicate metadata fields, then it will display one of them (depending on the version of the Adobe software, it’s either the first or last instance). There’s also some data reformatting. For example, Adobe reformatted the timestamp and dropped off the seconds. (To see the seconds, you need to use the ‘Advanced’ tab.)

And then there is the fictional data. The “Camera Data” lists the file source as “DSC”. (DSC is a digital still camera.) Nowhere in this file does the metadata identify this as a DSC file. This is Adobe making an interpretation based on unspecified information. Similarly, there is a “Lens” field listed with no value; Adobe cannot distinguish between undefined and defined as blank.

What is worse is that Adobe will happily update metadata. For example, take a JPEG or PNG that has no XMP metadata, open it in Adobe Photoshop CS5, and then use File Info to view the metadata. Amazingly, Adobe automatically generates a ton of metadata, including information that is neither contained in nor derived from the file. For example, Adobe automatically sets the TIFF resolution (my installation defaults to 72×72 pixels per inch). And the XMP creation, modified, and metadata timestamps are all set to the file’s timestamp on the computer’s file system, even though the test file contains no embedded timestamps. In some cases, Adobe will even update existing metadata fields just by opening the file.

If you want to view metadata as a point of curiosity, then this is a fine tool. However, I strongly recommend against using any Adobe product for actual forensic work. Between Adobe automatically adjusting colors, auto-sharpening JPEGs during saves, incorrectly implementing widely known algorithms, not matching their own documented results, and altering, omitting, or hiding metadata, I suggest staying away from all Adobe products for any kind of forensic work. (Adobe makes great programs for artists, but creating graphics is not the same as forensic analysis.)

Viewer: Apple’s Preview Inspector

Apple’s Preview contains an “Inspector” to view metadata. This viewer shows an extremely minimal amount of metadata. (What it shows is accurate; it just doesn’t show much. It omits most of the metadata.) As with Adobe’s File Info, the Preview Inspector is good for a cursory glance, but it is not complete enough to be used for any kind of reliable examination.

Viewer: Microsoft’s Photo Properties

As with Apple’s Preview, Microsoft’s Photo Viewer contains a “Properties” function that displays metadata. But unlike Apple, Microsoft’s metadata viewer displays fields that do not exist in the file, and omits fields that do exist.

In this example from the brick wall picture, there are fields like:

  • “Date taken”. There is no metadata field with that name. The EXIF standard defines metadata fields called “Date Time Original” (EXIF code 0x9003 identifies when it was created), “Date Time Digitized” (EXIF code 0x9004 for when it was digitized), and “Date Time” (EXIF code 0x0132, refers to the modified date/time), but nothing called “Date taken”. Similarly, “Program name” should be the software/firmware version (EXIF code 0x0131). Microsoft arbitrarily renames well-known metadata fields. And it isn’t like they are renaming it to something more understandable. In many cases, calling the firmware version “Program name” can be very misleading.
  • “Title” and “Subject”. I don’t know where these come from. The brick wall picture has this text in three different fields: the “Image Description” field (EXIF code 0x010e), the IPTC metadata caption, and the XMP description. However, none of those are the “Title” or “Subject”. (There is an XMP field called “Subject”, but it contains GPS information and not the “Red Brick” text.) At minimum, this is misrepresenting the metadata. At worst, it is extremely ambiguous and misleading.
  • “Tags”. Those GPS coordinates are either coming from the IPTC keywords or XMP subject fields. But it’s not clear which, or why they call it “tags”.
  • “Date acquired”, “Rating”, and “Image ID”. This file contains no metadata fields by those names. It isn’t like these fields exist with empty values in this file; those fields do not exist in this file. This is Microsoft just making up data fields and hoping that something might fit.
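To keep the standard names straight, here is a small Python lookup of the EXIF tag codes mentioned in the list above. The tag codes and names come from the EXIF standard. The mapping from viewer labels to candidate tags is an assumption for illustration only, since the viewer does not say which field it is actually reading, and the function name is hypothetical.

```python
# EXIF tag codes and their standard names (only the ones discussed above).
EXIF_TAGS = {
    0x010E: "ImageDescription",
    0x0131: "Software",
    0x0132: "DateTime",
    0x9003: "DateTimeOriginal",
    0x9004: "DateTimeDigitized",
}

# Assumed candidates behind each viewer label. "Date taken" is ambiguous:
# any of the three EXIF date fields could plausibly be its source.
VIEWER_LABELS = {
    "Date taken": [0x9003, 0x9004, 0x0132],
    "Program name": [0x0131],
}


def standard_names(label: str) -> list[str]:
    """List the standard EXIF fields a viewer label might correspond to."""
    return [f"0x{code:04x} {EXIF_TAGS[code]}" for code in VIEWER_LABELS.get(label, [])]
```

A label with several candidates, or none at all, is a sign that the viewer’s name cannot be traced back to a single field in the file.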

I recommend against using Adobe’s File Info for any forensic work, and Apple’s Preview Inspector isn’t good for any kind of formal examination. However, I don’t think anyone should use Microsoft Photo Viewer Properties for any reason. It is too inaccurate and misleading.

More than meets the eye

When analyzing metadata, it is not enough to just run a program and write down the fields and values that are extracted. You also need to understand what created the data, when it was created, and what the data actually means. It is also important to understand what the tool does, how it works, and how reliable the results are.

Understanding metadata is not an easy task that can be completed with a single push-button solution. As a result, not everyone understands metadata. This even includes people in the computer security field. For example, security expert Bruce Schneier equates metadata with surveillance. (While he first declared that in 2013, he has repeated it often since then.) However, this analogy is fundamentally flawed. Surveillance means capturing information, while metadata is just data about data. You can evaluate metadata without ever doing surveillance, and you can do surveillance without metadata. They are two completely independent concepts.

Similarly, a few years ago, FourAndSix’s Hany Farid made it very clear that he didn’t understand the XMP metadata format, didn’t know the purpose of the “DocumentAncestors” XMP field, and couldn’t even identify a plain-text time stamp in the metadata that clearly identified that the picture was taken in the morning.

Given that some experts have trouble understanding metadata, I can understand why amateur sleuths sometimes jump to the wrong conclusion. Sometimes it’s the tool and sometimes it’s not understanding the data.

February 21, 2016 at 02:20AM
via The Hacker Factor Blog