Parsing unusual XML file

edited March 2015 in How To...

Hi there,

I am trying to parse a BeerXML file (the specification can be found under ), which is kind of unusual. The structure looks more or less like the one below.

<?xml version="1.0" encoding="ISO-8859-1"?>
     <TYPE>All Grain</TYPE>

And so on. Apart from the wrong encoding (I will take care of this myself), there are many children, but the built-in XML parser reads only one parent, "RECIPES", and only one child, "RECIPE". The XML file is generated automatically by the BeerSmith software, so I cannot modify the output. Is there any way to properly extract, e.g., all values of "NAME" for every "HOP" within "HOPS"?


  • I don't see how the XML is "unusual". Looks OK to me. Processing has the loadXML() method, have you looked at it?

  • edited March 2015

    Like @PhiLho said, that's a valid XML file: :-\"

    XML beer = loadXML("beer.xml");

    However, if you're only looking for NAME values inside HOP entries, a custom parser can be written too: :-bd

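    void setup() {
      String[] beers = loadStrings("beer.xml");
      String[] names = findHopNames(beers);
      printArray(names);  // show the hop names that were found
    }

    // Scans the file line by line and collects the first <NAME> inside each <HOP>.
    static final String[] findHopNames(String[] arr) {
      if (arr == null || arr.length == 0)  return new String[0];
      StringList sl = new StringList();
      boolean hopFlag = false;  // true while we are inside a <HOP> element
      for (String s : arr) {
        if (s == null)  continue;
        if      (s.contains("<HOP>"))   hopFlag = true;
        else if (s.contains("</HOP>"))  hopFlag = false;
        else if (hopFlag) {
          String name = extractHopName(s);
          if (name != null) {
            sl.append(name);
            hopFlag = false;  // ignore the rest of this <HOP>
          }
        }
      }
      return sl.array();
    }

    // Returns the text between <NAME> and the next closing tag, or null if absent.
    protected static final String extractHopName(String s) {
      int tag = s.indexOf("<NAME>");
      if (tag < 0)  return null;
      int idx = tag + 6, end = s.indexOf("</", idx);
      return end >= 0? s.substring(idx, end) : null;
    }
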
  • Answer ✓

    Bad idea to use a custom parser for XML, in general... As shown in the loadXML page, the loaded XML can be walked to inspect the children, to an arbitrary depth.
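
For illustration, here is a minimal sketch of such a walk. It is written against Java's built-in DOM parser rather than Processing's XML class, so that it can run outside a sketch; the idea (visit an element, then recurse into its child elements) is the same. The class name TreeWalk, its method names and the sample document are assumptions made up for this example; only the tag names RECIPES, RECIPE, HOPS, HOP and NAME come from this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class TreeWalk {
  // Hypothetical sample; only the tag names are taken from the thread.
  static final String SAMPLE =
      "<RECIPES><RECIPE><NAME>Test Ale</NAME><HOPS>"
    + "<HOP><NAME>Cascade</NAME></HOP>"
    + "<HOP><NAME>Saaz</NAME></HOP>"
    + "</HOPS></RECIPE></RECIPES>";

  // Parses an XML string and returns its root element.
  static Element parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  // Records "path/to/element" for e and, recursively, for every element below it.
  static void walk(Element e, String prefix, List<String> out) {
    String path = prefix.isEmpty() ? e.getTagName() : prefix + "/" + e.getTagName();
    out.add(path);
    NodeList kids = e.getChildNodes();
    for (int i = 0; i < kids.getLength(); i++) {
      Node n = kids.item(i);
      // Skip text and comment nodes; only descend into elements.
      if (n.getNodeType() == Node.ELEMENT_NODE) walk((Element) n, path, out);
    }
  }

  // Convenience wrapper: all element paths of a document, in document order.
  static List<String> paths(String xml) {
    List<String> out = new ArrayList<>();
    walk(parse(xml), "", out);
    return out;
  }

  public static void main(String[] args) {
    for (String p : paths(SAMPLE)) System.out.println(p);
  }
}
```

Printing the paths like this is a quick way to check how deep the tree really goes before deciding which children to ask for.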

  • PhiLho, GoToLoop, thank you both for the contribution.

    GoToLoop, your custom parser is a great solution, but why write new functions when the default library should deal with the problem right away? :) If we cannot figure out what I am doing wrong here and I still cannot parse the file the normal way, I will use your solution.

    PhiLho, I am using loadXML(), but somehow I am not able to extract the information I want. I can only get "RECIPES" as the parent and "RECIPE" as the child. To write my code I used the references from here - . Maybe I just cannot see something obvious, but I would be grateful if you could show me how to get the name of each hop within the recipe.

  • Ok, I have figured it out. Below is the working code.

    XML xml;

    void setup() {
      xml = loadXML("recipe.xml");
      // Walk down one level at a time: RECIPES > RECIPE > HOPS > HOP > NAME
      XML[] recipe = xml.getChildren("RECIPE");
      XML[] hops = recipe[0].getChildren("HOPS");
      XML[] hop = hops[0].getChildren("HOP");
      String[] hop_names = new String[hop.length];
      for (int i = 0; i < hop.length; i++) {
        XML[] x = hop[i].getChildren("NAME");
        hop_names[i] = x[0].getContent();
      }
    }

  • Good! Thanks for sharing your solution.
    Pro Tip when asking for help: show your attempt, so that we can see what went wrong... :-) Even better here as you solved the issue by yourself.
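
The working code above reads only the first RECIPE and the first HOPS element. If a file holds several recipes, the same extraction can be applied to every HOP in the document. The following is a rough sketch of that idea in plain Java with the built-in DOM parser, not Processing's XML class; the class name HopNames, the helper childText and the two-recipe sample are hypothetical and exist only for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class HopNames {
  // Hypothetical two-recipe sample; only the tag names come from the thread.
  static final String SAMPLE =
      "<RECIPES>"
    + "<RECIPE><NAME>Pale Ale</NAME><HOPS>"
    + "<HOP><NAME>Cascade</NAME></HOP><HOP><NAME>Centennial</NAME></HOP>"
    + "</HOPS></RECIPE>"
    + "<RECIPE><NAME>Pilsner</NAME><HOPS>"
    + "<HOP><NAME>Saaz</NAME></HOP>"
    + "</HOPS></RECIPE>"
    + "</RECIPES>";

  // Returns the text of the first direct child element called tag, or null.
  static String childText(Element parent, String tag) {
    for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.ELEMENT_NODE && tag.equals(n.getNodeName())) {
        return n.getTextContent().trim();
      }
    }
    return null;
  }

  // Collects the NAME of every HOP in the document, across all recipes.
  static List<String> hopNames(String xml) {
    List<String> names = new ArrayList<>();
    try {
      Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
      NodeList hops = root.getElementsByTagName("HOP"); // every HOP, in document order
      for (int i = 0; i < hops.getLength(); i++) {
        String name = childText((Element) hops.item(i), "NAME");
        if (name != null) names.add(name); // skip a HOP without a NAME
      }
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(hopNames(SAMPLE)); // prints [Cascade, Centennial, Saaz]
  }
}
```

Looking only at direct children of each HOP keeps the recipe's own NAME out of the result, which is the same reason the custom line scanner earlier in the thread tracks whether it is inside a HOP.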
