Friday, June 24, 2011

Parsing Multiple XML Tags

How do you parse XML that contains several different tags, some of them repeated? For instance, I have a root element with a few child tags that share the same name. Below is the XML snippet I will be writing a parser for.
Multiple XML Tags
multiple-tags.xml
<?xml version="1.0" encoding="UTF-8"?>
<rootTag Name="Root">
 <ChildOne activityType="Task">
  <Name>P1</Name>
  <Incoming>0</Incoming>
  <Outgoing>P1P2</Outgoing>
 </ChildOne>
 <ChildOne activityType="Task">
  <Name>P2</Name>
  <Incoming>P1P2</Incoming>
  <Outgoing>P2G1</Outgoing>
 </ChildOne>
 <ChildOne activityType="GatewayParallel">
  <Name>G1</Name>
  <Incoming>P2G1</Incoming>
  <Outgoing>G1P3 G1P4</Outgoing>
 </ChildOne>
 <ChildTwo type="bpmn:SequenceEdge">
  <Name>P1P2</Name>
  <Source>P1</Source>
  <Destination>P2</Destination>
  <Path>P1P2</Path>
 </ChildTwo>
 <ChildTwo type="bpmn:SequenceEdge">
  <Name>P2G1</Name>
  <Source>P2</Source>
  <Destination>G1</Destination>
  <Path></Path>
 </ChildTwo>
 <ChildTwo type="bpmn:SequenceEdge">
  <Name>G1P3</Name>
  <Source>G1</Source>
  <Destination>P3</Destination>
  <Path>P2P3</Path>
 </ChildTwo>
</rootTag>


The obvious first step would be to create a POJO for each element I want to represent in Java: for example, a class ChildOne that holds its attribute and its inner tags (child elements) as member variables, with getters and setters for each. But that becomes too much work as soon as there is one more child tag, say ChildThree, with different sub tags.

Looking at it generically, each element is an object that has attributes, and these attributes are name/value pairs. The object also has some data in the form of sub elements, which again are name/value pairs. To represent this in Java I have created a class named com.mbm.demo.xml.parser.Attributes. It contains a map that stores all the attributes as name/value pairs, so I can use it for the root tag as well as for ChildOne, ChildTwo and so on. ChildOne also carries data in the form of child tags. For this I have a class named com.mbm.demo.xml.parser.AttributesAndValues, which extends com.mbm.demo.xml.parser.Attributes and contains a second Map that stores this data.
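To make the name/value idea concrete, here is a small sketch of how the first ChildOne element of multiple-tags.xml ends up as two maps. The class GenericModelSketch and its method names are made up for illustration, and plain java.util maps stand in for the Attributes and AttributesAndValues classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GenericModelSketch {

    /** Attributes of the first ChildOne element, as name/value pairs. */
    static Map<String, String> firstChildAttributes() {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("activityType", "Task");
        return attributes;
    }

    /** Sub elements of the first ChildOne element, as name/value pairs. */
    static Map<String, String> firstChildValues() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("Name", "P1");
        values.put("Incoming", "0");
        values.put("Outgoing", "P1P2");
        return values;
    }

    public static void main(String[] args) {
        System.out.println(firstChildAttributes()); // {activityType=Task}
        System.out.println(firstChildValues());     // {Name=P1, Incoming=0, Outgoing=P1P2}
    }
}
```

Nothing in these maps is specific to ChildOne, which is why the same two classes can also represent ChildTwo or any later child tag.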
Now let's write the Handler for this XML.
Handler.java


package com.mbm.demo.xml.parser;

import java.util.Arrays;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;


/**
 * @author Mohammed Bin Mahmood
 */
public class Handler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder(128);
    // XML tag names
    private final String POOLS = "rootTag";
    private final List<String> CHILD_TAG_NAMES = Arrays.asList("ChildOne", "ChildTwo");
    
    // parent node
    private Root root;
    
    private AttributesAndValues currentChild = null;
    
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // add characters to the buffer
        buffer.append(ch, start, length);
    }
    
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        
        if (POOLS.equals(qName)) {
            // create new object.
            root = new Root();
            // populate all
            populateAttributes(root, attributes);
        } else if (CHILD_TAG_NAMES.contains(qName)) {
            currentChild = new AttributesAndValues();
            populateAttributes(currentChild, attributes);
        } else {
            // probably a tag inside child tag.
        }
    }
    
    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (POOLS.equals(qName)) {
            // do nothing.
        } else if (CHILD_TAG_NAMES.contains(qName)) {
            root.addChild(qName, currentChild);
            currentChild = null;
        } else {
            // probably a tag inside child tag.
            if (currentChild != null) {
                // capture value
                currentChild.addValue(qName, getBufferValue());
            }
        }
        // always clear the buffer
        buffer.setLength(0);
    }
    
    /**
     * Returns the current value of the buffer, or null if it is empty or whitespace. This method
     * also resets the buffer.
     */
    private String getBufferValue() {
        if (buffer.length() == 0)
            return null;
        String value = buffer.toString().trim();
        buffer.setLength(0);
        return value.length() == 0 ? null : value;
    }
    
    public Root getRoot() {
        return root;
    }
    
    // --- UTILITIES ---
    private static void populateAttributes(com.mbm.demo.xml.parser.Attributes attribObject, Attributes attributes) {
        for (int index = 0; index < attributes.getLength(); index++) {
            // use the loop index for the name as well as the value
            attribObject.addAttribute(attributes.getQName(index), attributes.getValue(index));
        }
    }
    
}


I start with the root tag: create an instance and hold a reference to it. Then, for each child of the root, I create an AttributesAndValues instance and store in it the child's attributes along with the data of all its sub elements.
.........
......
        } else if (CHILD_TAG_NAMES.contains(qName)) {
            currentChild = new AttributesAndValues();
            populateAttributes(currentChild, attributes);
        } else {

.........
.........

        } else if (CHILD_TAG_NAMES.contains(qName)) {
            root.addChild(qName, currentChild);
            currentChild = null;
        } else {
            // probably a tag inside child tag.
            if (currentChild != null) {
                // capture value
                currentChild.addValue(qName, getBufferValue());
            }
        }

........
........
The whole logic lies in the snippets above. Finally, you need the com.mbm.demo.xml.parser.Attributes and com.mbm.demo.xml.parser.AttributesAndValues classes, which hold the data and the methods to populate it. You will find them below, together with Root.

Attributes.java

package com.mbm.demo.xml.parser;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;


public class Attributes {
    private final Map<String, String> attribs = new HashMap<String, String>(1);
    
    public Map<String, String> getAttribs() {
        return attribs;
    }
    
    public void addAttribute(String name, String value) {
        attribs.put(name, value);
    }
    
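    /**
     * Returns the attribute names as a view backed by the underlying map. Avoid calling this
     * method repeatedly; keep the result in a local variable instead.
     *
     * @return the set of attribute names
     */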
    public Set<String> getAttributeNames() {
        return attribs.keySet();
    }
    
}


AttributesAndValues.java
package com.mbm.demo.xml.parser;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;


public class AttributesAndValues extends Attributes {
    private final Map<String, String> values = new HashMap<String, String>(3);
    
    public Map<String, String> getValues() {
        return values;
    }
    
    public void addValue(String name, String value) {
        values.put(name, value);
    }
    
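    /**
     * Returns the value names as a view backed by the underlying map. Avoid calling this
     * method repeatedly; keep the result in a local variable instead.
     *
     * @return the set of value names
     */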
    public Set<String> getValueNames() {
        return values.keySet();
    }
    
}

Root.java
package com.mbm.demo.xml.parser;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;


public class Root extends Attributes {
    private Map<String, List<AttributesAndValues>> childrens = new HashMap<String, List<AttributesAndValues>>();
    
    // --------------getters-----------------
    public Map<String, List<AttributesAndValues>> getChildrens() {
        return childrens;
    }
    
    // ------------ methods to add data -------------------
    public void addChild(String name, AttributesAndValues child) {
        List<AttributesAndValues> _child = childrens.get(name);
        if (_child == null)
            _child = new ArrayList<AttributesAndValues>();
        _child.add(child);
        childrens.put(name, _child);
    }

}


NOTE: The above code does not work when child tags are nested. It works with the XML provided in this example and with similarly structured XML files.
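If nested child tags do need to be supported, one common approach is to keep a stack of the elements that are currently open instead of a single currentChild reference. The sketch below is only an illustration of that idea and not part of the code above: the Node class, StackHandler and NestedTagsSketch are hypothetical names, and the Detail and Label tags in the sample XML are made up.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NestedTagsSketch {

    /** Hypothetical generic node: tag name, attributes, text and child nodes. */
    static class Node {
        final String name;
        final Map<String, String> attributes = new HashMap<>();
        final List<Node> children = new ArrayList<>();
        final StringBuilder textBuffer = new StringBuilder();

        Node(String name) {
            this.name = name;
        }

        String text() {
            return textBuffer.toString().trim();
        }
    }

    /** Keeps every currently open element on a stack, so any nesting depth works. */
    static class StackHandler extends DefaultHandler {
        private final Deque<Node> open = new ArrayDeque<>();
        Node root;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            Node node = new Node(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                node.attributes.put(atts.getQName(i), atts.getValue(i));
            }
            if (open.isEmpty()) {
                root = node;                     // first element is the root
            } else {
                open.peek().children.add(node);  // attach to the enclosing element
            }
            open.push(node);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (!open.isEmpty()) {
                open.peek().textBuffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            open.pop(); // the element on top of the stack is now closed
        }
    }

    static Node parse(String xml) {
        try {
            StackHandler handler = new StackHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node root = parse("<rootTag><ChildOne activityType=\"Task\">"
                + "<Detail><Label>first</Label></Detail></ChildOne></rootTag>");
        Node label = root.children.get(0).children.get(0).children.get(0);
        System.out.println(label.name + " = " + label.text()); // Label = first
    }
}
```

The trade-off is that every element becomes a node in a tree, so looking up a value means walking children rather than reading a flat map.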
To use the above Handler, see the Parser class and its usage in the Parsing complex XML post. Do share your feedback and comments.
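For readers who just want to see a SAX handler being driven end to end, here is a minimal self-contained sketch. It is not the Parser class from the other post: SaxUsageSketch and MiniHandler are hypothetical names, and MiniHandler is a condensed stand-in for the Handler above that stores each child's sub tags in plain maps and ignores attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxUsageSketch {

    /** Condensed stand-in for the Handler: one map of sub tag values per child element. */
    static class MiniHandler extends DefaultHandler {
        private static final List<String> CHILD_TAG_NAMES = Arrays.asList("ChildOne", "ChildTwo");
        final Map<String, List<Map<String, String>>> children = new HashMap<>();
        private final StringBuilder buffer = new StringBuilder();
        private Map<String, String> current; // non-null while inside a child tag

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            buffer.setLength(0);
            if (CHILD_TAG_NAMES.contains(qName)) {
                current = new HashMap<>();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (CHILD_TAG_NAMES.contains(qName)) {
                children.computeIfAbsent(qName, k -> new ArrayList<>()).add(current);
                current = null;
            } else if (current != null) {
                current.put(qName, buffer.toString().trim()); // sub tag inside a child
            }
            buffer.setLength(0);
        }
    }

    /** Parses the XML string with the JDK's SAX parser and returns the collected children. */
    static Map<String, List<Map<String, String>>> parse(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            MiniHandler handler = new MiniHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.children;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rootTag Name=\"Root\">"
                + "<ChildOne activityType=\"Task\"><Name>P1</Name><Outgoing>P1P2</Outgoing></ChildOne>"
                + "<ChildTwo type=\"bpmn:SequenceEdge\"><Name>P1P2</Name><Source>P1</Source></ChildTwo>"
                + "</rootTag>";
        Map<String, List<Map<String, String>>> children = parse(xml);
        System.out.println(children.get("ChildOne").get(0).get("Name"));   // P1
        System.out.println(children.get("ChildTwo").get(0).get("Source")); // P1
    }
}
```

With the classes from this post, the equivalent would be to pass a new Handler() to parser.parse(...) and read the result from getRoot().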

