Time is Money: Speed up your data processing!

XSM sorts 1 gigabyte in 20 seconds and 10 gigabytes in 4 minutes, on a machine costing less than $500!
Who does better?

Thank you for allowing me to evaluate your excellent product. In terms of performance it was clearly the best product available on the market. I bench-tested it against a number of other sort utilities and it performed at a minimum of 2.5x the speed of the other programs.
I am therefore in a position to recognise high quality, and your software is amazing.

- Lindsay Mitchell - Financial Systems Consultant

Your challenges:

  • Quickly sort, merge, split, filter and deduplicate big data files of over 100 GB on affordable machines
  • Process a huge amount of new data quickly, every day
  • Increase your processing capacity by cutting processing time from 20 hours down to 4 hours
  • Speed up your data warehouse ETL/EDI data exchanges
  • Reduce your software costs by moving away from a much more expensive competing product

The solution:

  • XSM, already chosen by more than 60 clients in 12 countries
  • XSM delivers outstanding performance through its proprietary multi-threading technology, which makes the most of today's multi-processor architectures, on high-end IBM, SUN and HP machines as well as on affordable PCs with multi-core processors.
  • XSM has powerful features covering the classic Sort / Merge / Split / Filter operations needed for data warehousing and data mining.
    With the evolution of information technology (storage capacity, CPU power), data volumes have exploded over the last 10 years. We cannot rely on CPU power alone: software performance is essential!

But why use an external sort?

1. External sorting for database loading

Suppose you have to load large data files every night into your favorite database: Oracle, DB/2, MySQL, SQL-Server, Informix, Sybase, ...

In this example, we use MySQL, which is pretty fast at loading data.

Suppose you have a heavily indexed table. Each night you DELETE the table's content and then reload new data into the empty table:

  • if your input data is not sorted, your database server has to do the sorting itself while it builds the indexes
  • if your input data is pre-sorted, your database server just has to load it, with no sorting work needed to build the indexes.

Now, let's have a look at our benchmark:

  • Input is an ASCII text file, 100 megabytes, 1023009 records
  • Records are variable-length text, tab separated, 5 columns: 2 integers, 3 strings
  • SQL engine is MySQL Server 4.0.10-gamma running Linux 2.4.18 on Pentium II/550 MHz, 512 MB RAM (the phenomenon is the same whatever the RDBMS)
  • The chart shows the total elapsed time of the process, in seconds:

Database Loading with unsorted data:

  • data loading : 10200 secs
  • total : 10200 secs = 2 hours 50 minutes

Database Loading with sorted data:

  • pre-sort : 315 secs. (using XSM V5.08)
  • data loading : 213 secs
  • total : 528 secs. = 8 minutes 48 secs. That is roughly 20 times faster!
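
For readers who want to check the arithmetic, the short Python sketch below recomputes the sorted-data total and the speed-up from the figures listed above. The constant and function names are ours and purely illustrative; only the three timings come from the benchmark.

```python
# Timings copied from the benchmark above, in seconds.
UNSORTED_LOAD = 10200   # loading unsorted data
PRESORT = 315           # pre-sort step
SORTED_LOAD = 213       # loading pre-sorted data


def speedup(unsorted_total: int, presort: int, sorted_load: int) -> float:
    """Ratio of the unsorted total to the (pre-sort + load) total."""
    return unsorted_total / (presort + sorted_load)


def hms(seconds: int) -> str:
    """Format a duration in seconds as hours, minutes and seconds."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours} h {minutes} min {secs} s"


if __name__ == "__main__":
    print(hms(PRESORT + SORTED_LOAD))                             # 0 h 8 min 48 s
    print(round(speedup(UNSORTED_LOAD, PRESORT, SORTED_LOAD), 1))  # 19.3
```

The ratio comes out at about 19.3, which is where the "roughly 20 times" figure comes from.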

Now the value of external sorting is clear: pre-sorting is necessary to speed up the processing of huge data volumes.
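
To make the idea of an external sort concrete, here is a minimal Python sketch of the general textbook technique: sort chunks that fit in memory, spill each sorted chunk to a temporary file, then merge the sorted runs. It is not XSM's implementation, which is proprietary and multi-threaded. The function names, the chunk size and the choice of the first tab-separated column as an integer key are all assumptions made for illustration.

```python
import heapq
import os
import tempfile
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def _spill(sorted_lines: List[str]) -> str:
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w", encoding="utf-8") as run:
        run.writelines(sorted_lines)
    return path


def _read_run(path: str) -> Iterator[str]:
    """Stream a run back line by line; delete the file once it is exhausted."""
    with open(path, encoding="utf-8") as run:
        yield from run
    os.remove(path)


def external_sort(
    lines: Iterable[str],
    key: Callable[[str], object],
    chunk_size: int = 100_000,  # assumed value; tune to available memory
) -> Iterator[str]:
    """Sort newline-terminated lines using bounded memory.

    Phase 1 sorts chunks of at most chunk_size lines and spills them to disk.
    Phase 2 lazily merges all runs with a k-way merge.
    """
    source = iter(lines)
    run_paths: List[str] = []
    while True:
        chunk = list(islice(source, chunk_size))
        if not chunk:
            break
        chunk.sort(key=key)
        run_paths.append(_spill(chunk))
    return heapq.merge(*(_read_run(p) for p in run_paths), key=key)


def first_int_column(line: str) -> int:
    """Hypothetical sort key: the first tab-separated column, as an integer."""
    return int(line.split("\t", 1)[0])
```

Passing an open text file as `lines` and writing the returned iterator to a new file gives a pre-sorted copy ready for loading. This sketch opens every run at once during the merge, so a very large input would need a bigger chunk size or several merge passes.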

Don't let your "I can do everything!" database engine do it: that is not its job!

2. Merge / Split / Filter / Selective copy / Identification & removal of duplicate records

You need to merge / split / filter / copy data according to given criteria.

Let's take a simple example: every day you receive a Sales Report made up of 50 files, and you wish to split the data by Zip Code, creating one distinct file per Zip Code.

Two solutions:

  1. Use your RDBMS: most folks would go for this option, but it's not the right one!
    • Drop / Create table : 30 seconds
    • Load 50 files into table : 1 hour
    • Run a deduplicate SQL job : 1 hour
    • Run a hundred UNLOAD jobs, one per Zip Code : 2 hours

    Estimated time: 4 hours

  2. Use XSM as a batch external sort/merge: the right option!
    • In a single operation, XSM does the merge / sort / deduplicate / selective split

    [Figure: selective split]

    Estimated time: 5 minutes
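
To illustrate what a single merge / sort / deduplicate / selective split pass amounts to, here is a small Python sketch. It is not XSM and does not use any XSM command or option. It sorts in memory for brevity, so it only suits inputs that fit in RAM. The function name, the `zip_of` helper and the `<zip>.txt` output naming are assumptions made for illustration.

```python
import os
from itertools import chain, groupby
from typing import Callable, Dict, Iterable


def merge_sort_dedup_split(
    inputs: Iterable[Iterable[str]],
    zip_of: Callable[[str], str],
    out_dir: str,
) -> Dict[str, int]:
    """Merge several inputs, sort, drop duplicate records and write one
    file per Zip Code. Returns the number of records written per Zip Code.
    """
    os.makedirs(out_dir, exist_ok=True)
    # Merge + sort. Sorting on (zip, whole line) groups records by Zip Code
    # and makes identical records adjacent.
    records = sorted(chain.from_iterable(inputs),
                     key=lambda line: (zip_of(line), line))
    counts: Dict[str, int] = {}
    for zip_code, group in groupby(records, key=zip_of):
        previous = None
        written = 0
        # Selective split: only one output file is open at any time.
        path = os.path.join(out_dir, f"{zip_code}.txt")
        with open(path, "w", encoding="utf-8") as out:
            for line in group:
                if line != previous:  # deduplicate adjacent identical records
                    out.write(line)
                    written += 1
                previous = line
        counts[zip_code] = written
    return counts


def first_column(line: str) -> str:
    """Hypothetical layout: the Zip Code is the first tab-separated column."""
    return line.split("\t", 1)[0]
```

Called with the 50 daily files opened as line iterables and `first_column` as `zip_of`, this reads each record once and writes each output file once, whereas the RDBMS route needs separate load, deduplicate and unload steps.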

Rather than listen to a long marketing speech, just read our clients' feedback, then download XSM and evaluate it freely for yourself!