Friday, December 30, 2016

Added source for a simple console app to dump metadata and content using #TIKA and .NET.

I decided I needed to put out a simple command line program for dumping metadata. It’s been sitting on my todo list for too long. I’ve been using Tika for a long time now and it’s amazing how many file formats it supports. The list of supported formats keeps growing with every new release. This is bare bones compared to MetaDiver and is strictly TIKA based.

TIKA supported formats:

There are so many supported formats I can’t list them all.

I know we already have a lot of programs out there for parsing metadata from files but most are commercial. Phil Harvey’s Exiftool is a free program that does an amazing job at metadata but you should always have another option. More importantly, each tool has limits on the formats it handles. Tika supports consuming exiftool as of 1.9 to supplement metadata using the Java version! Pretty amazing.

I decided to keep it simple with the 1.0 release. You’ll get the key-value pairs from the file metadata and you can also dump the text from the file to the console.

Sample output:

T:\MD_DumpCLI>MD_DumpCLI.exe -f "T:\Test_data\exif\IMG_0581.JPG"
Author:  David Dym
License: Apache 2.0

Filename: N:\Test_data\exif\IMG_0581.JPG

Aperture Value: f/2.8
Brightness Value: 5067/1265
Color Space: sRGB
Component 1: Y component: Quantization table 0, Sampling factors 2 horiz/2 vert
Component 2: Cb component: Quantization table 1, Sampling factors 1 horiz/1 vert
Component 3: Cr component: Quantization table 1, Sampling factors 1 horiz/1 vert
Components Configuration: YCbCr
Compression: JPEG (old-style)
Compression Type: Baseline
Content-Type: image/jpeg
Creation-Date: 2011-10-23T13:55:09

.... (cut off the other 200+ fields)
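
Since the output is just "Key: Value" lines, it should be easy to post-process with a script. Here is a minimal Python sketch, assuming the format shown in the sample above. The function name parse_dump is made up for illustration and is not part of MD_DumpCLI, and the sample text is a few lines copied from the output above.

```python
def parse_dump(text):
    """Parse 'Key: Value' lines into a dict (hypothetical helper, not part of MD_DumpCLI)."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so values that contain colons
        # (timestamps, the 'Component N' lines) stay in one piece.
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # blank lines and anything without a 'Key: Value' shape
        # If a key shows up more than once, the last value wins.
        fields[key.strip()] = value.strip()
    return fields


sample = """Aperture Value: f/2.8
Component 1: Y component: Quantization table 0, Sampling factors 2 horiz/2 vert
Content-Type: image/jpeg
Creation-Date: 2011-10-23T13:55:09
"""

fields = parse_dump(sample)
print(fields["Content-Type"])  # prints: image/jpeg
```

Note that the Author and License banner lines have the same shape, so they would end up in the dictionary too unless you filter them out.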

GitHub Page:



by Dave via EasyMetaData