The JSON data format is widely used in modern client/server applications such as Twitter because it is more compact than XML and has a data-centric design.
DMExpress can process UTF-8 JSON input data through the JSON Reader task. The output of the JSON Reader can then be used by downstream tasks within a DMExpress job.
The DMExpress JSON Reader task enables JSON data to be treated as a DMExpress source. Specifically, the JSON Reader converts a UTF-8 JSON source into a UTF-8 CSV (comma-separated values) output, which can then be read by other DMExpress tasks.
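To make the JSON-to-CSV conversion concrete, here is a minimal Python sketch of that kind of transformation. It is only an illustration, not how the JSON Reader is implemented: the function name `json_to_csv`, the field names (`id`, `user`, `text`), and the sample records are all hypothetical, and the sketch assumes the input is a JSON array of flat objects and that no header row is written.

```python
import csv
import io
import json


def json_to_csv(json_text, fields):
    """Convert a JSON array of flat objects into CSV text, one row per object.

    `fields` fixes the column order. Keys missing from a record become empty
    values, and keys not listed in `fields` are ignored.
    """
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",   # skip keys that are not in `fields`
        lineterminator="\n",
    )
    for record in records:
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


# Hypothetical sample data; the actual layout of tweets.txt is not shown here.
sample = (
    '[{"id": 2, "user": "bob", "text": "hello, world"},'
    ' {"id": 1, "user": "alice", "text": "hi"}]'
)

if __name__ == "__main__":
    print(json_to_csv(sample, ["id", "user", "text"]), end="")
```

Values that contain a comma are quoted by the `csv` module, so each record still maps to exactly one CSV row that a downstream step can split into fields.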
The attached example contains the following:
Below are the steps to create and run this job.
Open the DMExpress Job Editor on a Windows workstation, click on File->Save Job As… to create a new job, name it TweetSorter.dxj, and save it to the desired folder on the Windows file system.
In this example, the JSON Reader takes the JSON file tweets.txt as its source and writes the CSV output directly to stdout. Create the JSON Reader task to perform the JSON-to-CSV conversion as follows:
This layout will be linked in as external metadata in the subsequent task. The populated window should look similar to this:
You can simply add the provided SortTweet.dxt task to the job flow, or you can create the task yourself as follows:
Once the task has been defined, click on the Sequence toolbar button and draw an arrow from the JSON Reader task to the SortTweet.dxt task as shown below:
Define the environment variable $LOCAL_TARGET_DIR to point to the directory where the sorted Twitter data will be stored. For example:
LOCAL_TARGET_DIR=C:\Users\dmxdemo\output
Run the job from the command line by navigating to the directory that contains the job and entering:
dmxjob /run TweetSorter.dxj
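If you prefer to script these last two steps, the following Python sketch sets LOCAL_TARGET_DIR and invokes the same dmxjob command shown above. It assumes dmxjob is on the PATH; the helper names `build_job_run` and `run_job` are hypothetical and are not part of DMExpress.

```python
import os
import subprocess


def build_job_run(job_file, target_dir, base_env=None):
    """Return the (command, environment) pair for running a DMExpress job.

    The environment is a copy of `base_env` (or of the current process
    environment) with LOCAL_TARGET_DIR set to `target_dir`.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LOCAL_TARGET_DIR"] = target_dir
    command = ["dmxjob", "/run", job_file]  # same command as entered by hand
    return command, env


def run_job(job_file, target_dir, job_dir="."):
    """Run the job from `job_dir`; raises CalledProcessError on a non-zero exit."""
    command, env = build_job_run(job_file, target_dir)
    return subprocess.run(command, cwd=job_dir, env=env, check=True)


# Example call (assumes dmxjob is installed and on the PATH):
# run_job("TweetSorter.dxj", r"C:\Users\dmxdemo\output")
```

Keeping the command construction separate from the call that executes it makes it possible to check the command and environment without DMExpress installed.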
311_JSON_ReaderSort.zip, compatible with DMExpress version 7.12 or higher
For details on JSON, go to www.json.org.
For additional information on JSON handling using DMExpress, see JSON in the DMExpress Help.
Copyright © 2016 Syncsort All rights reserved.