Friday, January 4, 2008

An IPFIX-Based File Format

Members of the IP Flow Information Export (IPFIX) Working Group have
released an initial -00 Internet Draft for "An IPFIX-Based File Format."
The IPFIX WG has developed a MIB module for monitoring IPFIX
implementations. Means for configuring these devices have not been
standardized yet. Per its charter, the WG is developing an XML-based
configuration data model that can be used for configuring IPFIX devices
and for storing, modifying and managing IPFIX configuration parameter
sets; this work is performed in close collaboration with the NETCONF WG.
The IETF Proposed Standard "Information Model for IP Flow Information
Export" defines an XML-based specification of templates, abstract data
types, and IPFIX Information Elements, which can be used for automatically
checking the syntactical correctness of the specification of IPFIX
Information Elements. The new "IPFIX-Based File Format" document
describes a file format for the storage of flow data based upon the
IPFIX Message format. It proposes a set of requirements for flat-file,
binary flow data file formats, then applies the IPFIX message format
to these requirements to build a new file format. This IPFIX-based file
format is designed to facilitate interoperability and reusability
among a wide variety of flow storage, processing, and analysis tools...
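
To make the idea concrete, here is a minimal Python sketch that treats a flow file as nothing more than IPFIX Messages written back to back. The 16-byte Message Header layout (version 10, length, export time, sequence number, observation domain ID) follows the IPFIX protocol specification; the function names build_message and read_messages are hypothetical, the payload is treated as opaque bytes rather than parsed Sets, and any extra file-level structure the draft defines is not modeled here.

```python
import io
import struct

IPFIX_VERSION = 10
# IPFIX Message Header, network byte order:
# version (16 bits), length (16), export time (32),
# sequence number (32), observation domain ID (32) = 16 bytes.
HEADER = struct.Struct("!HHIII")


def build_message(payload: bytes, export_time: int, sequence: int, domain_id: int) -> bytes:
    """Prefix an opaque payload (the Sets) with an IPFIX Message Header."""
    length = HEADER.size + len(payload)  # length covers header plus payload
    return HEADER.pack(IPFIX_VERSION, length, export_time, sequence, domain_id) + payload


def read_messages(stream):
    """Yield (export_time, sequence, domain_id, payload) for each message in a stream."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < HEADER.size:
            raise ValueError("truncated message header")
        version, length, export_time, sequence, domain_id = HEADER.unpack(header)
        if version != IPFIX_VERSION:
            raise ValueError(f"unexpected version {version}")
        if length < HEADER.size:
            raise ValueError("message length shorter than header")
        payload = stream.read(length - HEADER.size)
        if len(payload) != length - HEADER.size:
            raise ValueError("truncated message body")
        yield export_time, sequence, domain_id, payload


if __name__ == "__main__":
    # An in-memory "file" holding two messages with placeholder payloads.
    data = build_message(b"abcd", 1199404800, 0, 1) + build_message(b"", 1199404801, 1, 1)
    for message in read_messages(io.BytesIO(data)):
        print(message)
```

Because each message carries its own length, a reader can walk the file sequentially without any separate index, which is part of what makes the message format a plausible basis for flat-file storage.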

[Note, in relation to W3C's Efficient XML Interchange (EXI) Working Group
Charter and Deliverables:] Over the past decade, XML markup has emerged
as a new 'universal' representation format for structured data. It is
intended to be human-readable; indeed, that is one reason for its rapid
adoption. However, XML has limited usefulness for representing network
flow data. Network flow data has a simple, repetitive, non-hierarchical
structure that does not benefit much from XML. An XML representation of
flow data would be an essentially flat list of the attributes and their
values for each flow record. The XML approach to data encoding is very
heavyweight when compared to binary flow encoding. XML's use of start-
and end-tags, and plain-text encoding of the actual values, leads to
significant inefficiency in encoding size. Typical network flow datasets
can contain millions or billions of flows per hour of traffic represented.
Any increase in storage size per record can have dramatic impact on flow
data storage and transfer sizes. While data compression algorithms can
partially remove the redundancy introduced by XML encoding, they
introduce additional overhead of their own. A further problem is that
XML processing tools require a full XML parser... This leads us to
propose the IPFIX Message format as the basis for a new flow data file
format.
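
The size argument above can be illustrated with a rough Python sketch that encodes one flow record both ways. The field names are taken from the IPFIX Information Model, but the sample values, the flat XML layout, and the helper names encode_xml and encode_binary are assumptions made for illustration; neither encoding is taken from the draft. The binary layout is a plain fixed-length record and leaves out the IPFIX Template that would describe it, which is sent once and shared by many records.

```python
import ipaddress
import struct

# One illustrative flow record; the values are made up for this example.
flow = {
    "sourceIPv4Address": "192.0.2.1",
    "destinationIPv4Address": "198.51.100.7",
    "sourceTransportPort": 49152,
    "destinationTransportPort": 80,
    "protocolIdentifier": 6,
    "octetDeltaCount": 123456,
    "packetDeltaCount": 98,
}

# Fixed-length binary layout, network byte order, no padding:
# two IPv4 addresses (4 bytes each), two ports (2 each),
# protocol (1), octet and packet counters (8 each) = 29 bytes.
RECORD = struct.Struct("!4s4sHHBQQ")


def encode_xml(record: dict) -> bytes:
    """Flat XML: one element per attribute, values as plain text."""
    fields = "".join(f"<{name}>{value}</{name}>" for name, value in record.items())
    return f"<flow>{fields}</flow>".encode("ascii")


def encode_binary(record: dict) -> bytes:
    """Pack the same attributes into a fixed-length binary record."""
    return RECORD.pack(
        ipaddress.IPv4Address(record["sourceIPv4Address"]).packed,
        ipaddress.IPv4Address(record["destinationIPv4Address"]).packed,
        record["sourceTransportPort"],
        record["destinationTransportPort"],
        record["protocolIdentifier"],
        record["octetDeltaCount"],
        record["packetDeltaCount"],
    )


if __name__ == "__main__":
    xml_size = len(encode_xml(flow))
    binary_size = len(encode_binary(flow))
    print(f"XML: {xml_size} bytes, binary: {binary_size} bytes")
```

The binary record is always 29 bytes, while the XML form repeats every element name twice per record, so the gap grows with the length of the names rather than with the amount of data carried.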
