Xponent: Specialists in Large XML Documents

XmlSplit: Split Any Size XML Using Command-line, Script or Wizard Dialog.

Fast, Flexible XML Splitter
XmlSplit™ provides several methods that split XML of any size into multiple smaller, well-formed XML files. Numerous parameters give control over where the XML is split. It includes two programs:
  • A command-line XML splitter run from script or command prompt.
  • A wizard that generates scripts from selected options or directly splits XML.
XmlSplit is ideal for dividing multi-gigabyte XML files for database import, ETL scenarios or any process that requires smaller XML files. Read how a customer split a 113 GB XML

"XMLSplit has saved me immense amounts of time in working with large, unruly XML files. The interface is easy to use, and the program itself is quick and concise. All I can say is XMLSplit is awesome!" 
Jason Descamps, Chief Information Officer, Marisol International, Springfield, MO USA. More testimonials

Check out these detailed examples.

What's new in the current release.


Wizard screen shots and details

Download And Test It Today

Try our products and test the performance yourself. No XML is too big for our tools.
No file size limits in trial.

 
Price: $99.00 US

Multiple Split Methods

Split Into Files of a Specified Size.

After the specified number of bytes has been written, the split occurs at the next element that will result in a well-formed XML file.
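The size rule can be illustrated with a minimal Python sketch. It is not XmlSplit's implementation: the function name `split_by_size`, the `root_tag` wrapper and the decision to count only element bytes (not the wrapper) are assumptions made for the example. The point it shows is that a file is closed only at an element boundary, so a split file may run slightly past the requested size.

```python
import xml.etree.ElementTree as ET


def split_by_size(records, max_bytes, root_tag="root"):
    """Group serialized elements into files (hypothetical helper).

    A file is closed at the first element boundary reached after
    max_bytes of element data have been written to it.
    """
    files, current, written = [], [], 0
    for record in records:
        current.append(record)
        written += len(record.encode("utf-8"))
        if written >= max_bytes:
            files.append(current)
            current, written = [], 0
    if current:
        files.append(current)
    # Wrap every group in a root element so each result is well-formed.
    return [f"<{root_tag}>{''.join(group)}</{root_tag}>" for group in files]


records = ["<r>aaaa</r>"] * 5           # 11 bytes each
parts = split_by_size(records, max_bytes=20)
for part in parts:
    ET.fromstring(part)                 # raises if a part is not well-formed
print([len(ET.fromstring(p)) for p in parts])   # [2, 2, 1]
```

Each of the first two files holds 22 bytes of element data, just over the 20-byte target, because the split waits for the element to end.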

Split Every nth Element

The splitter creates a new split file every nth element at the specified depth.
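As a rough illustration of splitting every nth element at a given depth, the sketch below uses Python's standard `xml.etree.ElementTree.iterparse`. The function name, the `root_tag` wrapper and the convention that the root element is at depth 0 are assumptions for this example; XmlSplit may count depth differently.

```python
import io
import xml.etree.ElementTree as ET


def split_every_nth(xml_text, n, depth=1, root_tag="root"):
    """Return XML strings holding up to n elements found at `depth`.

    Hypothetical helper: depth 0 is assumed to be the root element.
    """
    chunks, current, level = [], [], -1
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            level += 1
            continue
        # "end" event: the element is complete, including its children.
        if level == depth:
            elem.tail = None            # drop whitespace between siblings
            current.append(ET.tostring(elem, encoding="unicode"))
            if len(current) == n:
                chunks.append(current)
                current = []
        level -= 1
    if current:
        chunks.append(current)          # final, possibly shorter, chunk
    return [f"<{root_tag}>{''.join(c)}</{root_tag}>" for c in chunks]


doc = "<root><a>1</a><a>2</a><a>3</a><a>4</a><a>5</a></root>"
parts = split_every_nth(doc, n=2)
print(parts[0])   # <root><a>1</a><a>2</a></root>
```

With five elements and n=2, the sketch yields three parts, the last containing the single remaining element.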

Split When An Element Name Changes

Creates a new split file when the name of an element at the specified depth changes.

Split When The Value Of Specified Attribute Changes

The splitter creates a new split file when the value of the attribute changes in an element at the specified depth.
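Splitting on a change can be pictured as grouping consecutive elements that share a key. The Python sketch below is only an illustration under assumptions: it loads the whole document into memory for brevity (unlike a streaming splitter), works on the children of the root, and the function name, the `region` attribute and the `root_tag` wrapper are hypothetical. Passing `key=lambda e: e.tag` gives the element-name variant of the same idea.

```python
import itertools
import xml.etree.ElementTree as ET


def split_on_change(xml_text, key, root_tag="root"):
    """Start a new part whenever key(element) differs from the previous one."""
    root = ET.fromstring(xml_text)
    parts = []
    # groupby only merges *consecutive* items with equal keys, so a value
    # that reappears later starts a new part, as a change-based split would.
    for _, group in itertools.groupby(root, key=key):
        body = []
        for elem in group:
            elem.tail = None
            body.append(ET.tostring(elem, encoding="unicode"))
        parts.append(f"<{root_tag}>{''.join(body)}</{root_tag}>")
    return parts


doc = (
    '<root><o region="east"/><o region="east"/>'
    '<o region="west"/><o region="east"/></root>'
)
parts = split_on_change(doc, key=lambda e: e.get("region"))
sizes = [len(ET.fromstring(p)) for p in parts]
print(sizes)   # [2, 1, 1]
```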

Split When Namespace Changes

Creates a new split file when the namespace in scope changes.

Split When a Comment, CDATA or ProcessingInstruction Occurs

Accepts a list of any of these node types and creates a new split file when a node of a listed type occurs. Optionally, the split happens only when that node contains specified text.

Useful Options

Preserve Structure. Creates split files having the same structure as the source XML. Helps ensure XML schema validity.

Preview Mode. The Wizard creates and displays the first split file only.

Header Element. Includes the first element under the root (header) in each split file. Read our blog article...

Depth. Specifies the element depth in the XML hierarchy for inclusion.

Root Element. Wraps the content of each split file in the specified root element. If the root has attributes, their quotes are handled automatically so that the entire root is properly quoted for the script engine.

Include File. The specified file is inserted in each split file. One use is to ensure each split file has the same structure as the source XML.

Append File. Inserts the specified file at the end of each split file. When used with an Include File, each split file may be nested within multiple parent elements.

Threshold Element. Specifies the element in the source file at which the splitter begins processing, skipping over all preceding nodes.

Encoding. Specifies the encoding used to write the split files. utf-8, utf-16 and iso-8859-1 are currently supported.

Write Byte Order Mark. Specifies whether the splitter writes a byte order mark in each split file. This is useful when feeding the split files into other software that may either require a byte order mark or throw an exception when one is present.

Write DOCTYPE. If a DOCTYPE node occurs, specifies whether the splitter writes it in every split file, in the first split file only, or in none. This is useful where a DTD containing named entities may not be available or needed.
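To see why the Write Byte Order Mark option above matters to downstream software, here is a small Python sketch for the UTF-8 case. The helper names are hypothetical and this is not how XmlSplit writes its files; it only shows what the three BOM bytes are and how a consumer that does not expect them ends up with a stray character.

```python
import codecs


def encode_split_file(text, write_bom):
    """Encode text as UTF-8, optionally prefixed with a byte order mark."""
    data = text.encode("utf-8")
    return codecs.BOM_UTF8 + data if write_bom else data


def has_bom(data):
    """Report whether the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


with_bom = encode_split_file("<a/>", write_bom=True)
without_bom = encode_split_file("<a/>", write_bom=False)

print(has_bom(with_bom), has_bom(without_bom))   # True False

# A plain UTF-8 decode keeps the mark as a leading U+FEFF character,
# which is what trips up consumers that do not expect it.
naive = with_bom.decode("utf-8")
# The "utf-8-sig" codec strips the mark when it is present.
clean = with_bom.decode("utf-8-sig")
```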

How Does It Work?

The XmlSplit Wizard generates PowerShell and Windows Script Host scripts that run XmlSplit in command-line mode. The Wizard can also split the XML directly from its dialog, where it reports progress and allows cancellation at any time.

XmlSplit uses an XmlReader to read and parse the input XML document. As each node is read, it evaluates the input parameters to determine whether the node is written to the current split file or a new split file is created. Auto-numbered split files are named based on an output file parameter. File names containing spaces are handled automatically by enclosing them in quotes.

Many XmlSplit customers receive large XML files electronically and need to split and import them into database tables. Calling an XmlSplit script from another script allows the entire process to be fully automated.
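One way to chain the steps, sketched here in Python under stated assumptions, is a small runner that stops the pipeline when any step exits with a non-zero code. The helper `run_step` is hypothetical, and the commands below are stand-ins so the sketch runs anywhere: in practice the first entry would be the command line the Wizard generates and the second your own import job.

```python
import subprocess
import sys


def run_step(command):
    """Run one pipeline step; raise if it exits with a non-zero code."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"step failed with exit code {result.returncode}: {command}"
        )
    return result.stdout


# Stand-in commands (assumptions, not real XmlSplit arguments).
pipeline = [
    [sys.executable, "-c", "print('split done')"],
    [sys.executable, "-c", "print('import done')"],
]
outputs = [run_step(command).strip() for command in pipeline]
print(outputs)
```

Raising on the first failure keeps a database import from running against an incomplete set of split files.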

The rate of execution is constant with respect to the size of the file being split because only a small segment of the file is held in memory at any time.
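The same streaming idea can be sketched in Python, with the standard library's `iterparse` standing in for the .NET XmlReader that XmlSplit uses. This is an analogue under assumptions, not the product's code: the function name and the `item` tag are made up for the example. Clearing the root after each processed element is what keeps memory use roughly flat regardless of input size.

```python
import io
import xml.etree.ElementTree as ET


def count_records(stream, tag):
    """Count elements named `tag` without keeping the document in memory."""
    count = 0
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)             # first event: start of the root
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            count += 1                  # a real splitter would write elem here
            root.clear()                # discard elements already processed
    return count


stream = io.BytesIO(b"<root>" + b"<item>x</item>" * 1000 + b"</root>")
total = count_records(stream, "item")
print(total)   # 1000
```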

If the source XML file has XML syntax errors or contains characters not allowed in XML files, XmlSplit will report the error and stop processing. We recommend our XMLMax editor to fix such errors, particularly if the XML is very large.


What's New

Current Version: 2.8

New tabbed interface makes the program easier to use than ever.

The size and encoding of the source XML document were added to the status strip.

Fixed PowerShell scripts that were incomplete.

The split by size method has a new input for the number of split files and the program automatically estimates the input for their approximate file size. The input for element name is automatically filled with the name of the first repeating element.

When the xmlsplit.exe command line program returns a non-zero exit code, the exit code and command line arguments are appended to a log file named xmlsplit error log.txt in the user's local applicationdata folder. Each entry is date and time stamped.

System Requirements

Any Windows operating system from XP to the most recent.

One gigabyte of random access memory is recommended.




Copyright © 2008-2014 Xponent LLC. All rights reserved.