Writing / parsing XML

Posted on Thu Dec 30, 2010 10:27 am

Is it technically bad practice to write XML in which elements that share a name (and repeat in the document) don't all have the same number, names, or types of child elements? Here's an example:
Code:
<component>
    <toppos>540</toppos>
    <leftpos>1100</leftpos>
    <width>185</width>
    <height>387</height>
    <imageurl>http://myimages.com/back.jpg</imageurl>
    <fontsize>0</fontsize>
    <rotation>90</rotation>
</component>
<component>
    <toppos>440</toppos>
    <leftpos>875</leftpos>
    <width>2500</width>
    <height>575</height>
    <text>Here's some text</text>
    <font>VNI-Helve.ttf</font>
    <fontsize>36</fontsize>
    <fontcolor>000000</fontcolor>
    <rotation>90</rotation>
    <textalign>center</textalign>
</component>


So here you can see two elements that share the name 'component' but do not have the same number or type of child elements. I know the XML will validate, but is it the correct way of doing things? Would it be better to use attributes, for example, to describe the component as an image vs. text in this case? I'm not an XML expert, but this was my understanding.
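
To make the attribute idea concrete, here is a minimal Python sketch using the standard-library xml.etree.ElementTree. The "type" attribute, its values "image" and "text", and the wrapping <page> root are all assumptions added for illustration; none of them appear in the XML above.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the XML above: a "type" attribute names the kind of
# component, and an assumed <page> root wraps the fragment so it parses as one
# document. Children are trimmed to keep the example short.
XML = """
<page>
  <component type="image">
    <toppos>540</toppos>
    <imageurl>http://myimages.com/back.jpg</imageurl>
  </component>
  <component type="text">
    <toppos>440</toppos>
    <text>Here's some text</text>
    <fontsize>36</fontsize>
  </component>
</page>
"""


def describe(component):
    """Branch on the explicit type attribute instead of guessing from the children."""
    kind = component.get("type")
    if kind == "image":
        return "image: " + component.findtext("imageurl")
    if kind == "text":
        return "text: " + component.findtext("text")
    raise ValueError("unknown component type: %r" % kind)


root = ET.fromstring(XML)
descriptions = [describe(c) for c in root.findall("component")]
for line in descriptions:
    print(line)
```

With the discriminator in an attribute, the parser no longer has to infer the component kind from which children happen to be present.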
steelcity_ballin
Gerbilus Supremus
Silver subscriber
 
 
Posts: 11924
Joined: Mon May 26, 2003 5:55 am
Location: Pittsburgh PA

Re: Writing / parsing XML

Posted on Thu Dec 30, 2010 12:55 pm

I don't think it makes much difference. It is kind of like having null fields in the database.

Of course, if the XML is going to be run through XSLT there may be a difference, but I don't think either layout makes the transform impossible.
Flying Fox
Gerbil God
 
Posts: 24576
Joined: Mon May 24, 2004 2:19 am

Re: Writing / parsing XML

Posted on Thu Dec 30, 2010 1:08 pm

Flying Fox wrote:
I don't think it makes much difference. It is kind of like having null fields in the database.

Of course, if the XML is going to be run through XSLT there may be a difference, but I don't think either layout makes the transform impossible.


It's not impossible, no. My concern is that, when parsing it, the only way I can tell which components go with which page is their processing order. I don't like that, because the order could change in the future without my knowledge, and it forces me to hard-code a static mapping of variables when I parse this stuff.

Alternatively, I'm going to bind it to a DataSet (I'm a .NET guy) and go from there.
steelcity_ballin

Re: Writing / parsing XML

Posted on Fri Dec 31, 2010 12:15 am

steelcity_ballin wrote:
It's not impossible, no. My concern is that, when parsing it, the only way I can tell which components go with which page is their processing order. I don't like that, because the order could change in the future without my knowledge, and it forces me to hard-code a static mapping of variables when I parse this stuff.

Alternatively, I'm going to bind it to a DataSet (I'm a .NET guy) and go from there.


Isn't that really what a DTD is for: defining the valid contents of an XML document? You would need to define the superset of all sub-elements; in the example you gave, that means defining both <text> and <imageurl>. As FF noted, a missing element should come back as a null or undefined value. If you are mapping the XML onto a class, the sub-elements common to all components would map to fields in the base class, and the sub-elements that are not common should map into a derived class. The class constructor would be responsible for initializing each field to a known "NULL" value.
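
The base-class/derived-class mapping described above could be sketched like this in Python. The class names (Component, ImageComponent, TextComponent), the helper from_element, and the choice of None as the "NULL" value are assumptions for illustration; the split between common and type-specific fields is read off the two example components in the first post.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Component:
    """Children seen in every <component>; None means 'not present in the XML'."""
    toppos: Optional[int] = None
    leftpos: Optional[int] = None
    width: Optional[int] = None
    height: Optional[int] = None
    fontsize: Optional[int] = None
    rotation: Optional[int] = None


@dataclass
class ImageComponent(Component):
    imageurl: Optional[str] = None


@dataclass
class TextComponent(Component):
    text: Optional[str] = None
    font: Optional[str] = None
    fontcolor: Optional[str] = None
    textalign: Optional[str] = None


INT_FIELDS = {"toppos", "leftpos", "width", "height", "fontsize", "rotation"}


def from_element(elem):
    """Build the matching class; fields without an XML child keep their None default."""
    cls = ImageComponent if elem.find("imageurl") is not None else TextComponent
    obj = cls()
    known = {f.name for f in fields(obj)}
    for child in elem:
        if child.tag in known and child.text is not None:
            value = child.text.strip()
            setattr(obj, child.tag, int(value) if child.tag in INT_FIELDS else value)
    return obj


image = from_element(ET.fromstring(
    "<component><toppos>540</toppos>"
    "<imageurl>http://myimages.com/back.jpg</imageurl></component>"
))
print(image)
```

Any child that is absent from the XML simply stays at its default, so the caller can test for None instead of relying on element order.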

To your original question... In my experience, albeit limited, the XML parsers I have used don't care. Regardless, you should always assume any value derived from the XML document is suspect until you can determine otherwise. Validate what comes back from your parser before you use it.
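
As a rough illustration of "validate what comes back", the Python sketch below checks the geometry fields before anything uses them. The helper name read_geometry, the list of required fields, and the rule that values must be non-negative integers are assumptions, not something stated in the thread.

```python
import xml.etree.ElementTree as ET

# Assumed rule set: these four children must exist and hold non-negative integers.
REQUIRED_INTS = ("toppos", "leftpos", "width", "height")


def read_geometry(component):
    """Return the geometry as ints, or raise ValueError naming the bad field."""
    geometry = {}
    for name in REQUIRED_INTS:
        raw = component.findtext(name)
        if raw is None:
            raise ValueError("missing <%s>" % name)
        try:
            value = int(raw.strip())
        except ValueError:
            raise ValueError("<%s> is not an integer: %r" % (name, raw)) from None
        if value < 0:
            raise ValueError("<%s> must not be negative: %d" % (name, value))
        geometry[name] = value
    return geometry


good = ET.fromstring(
    "<component><toppos>540</toppos><leftpos>1100</leftpos>"
    "<width>185</width><height>387</height></component>"
)
bad = ET.fromstring("<component><toppos>540</toppos><leftpos>oops</leftpos></component>")

print(read_geometry(good))
try:
    read_geometry(bad)
except ValueError as err:
    print("rejected:", err)
```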

--SS

SecretSquirrel
Gerbil Jedi
Gold subscriber
 
 
Posts: 1736
Joined: Tue Jan 01, 2002 7:00 pm
Location: The Colony, TX (Dallas suburb)

Re: Writing / parsing XML

Posted on Fri Dec 31, 2010 1:23 am

steelcity_ballin wrote:
It's not impossible, no. My concern is that, when parsing it, the only way I can tell which components go with which page is their processing order. I don't like that, because the order could change in the future without my knowledge, and it forces me to hard-code a static mapping of variables when I parse this stuff.

Alternatively, I'm going to bind it to a DataSet (I'm a .NET guy) and go from there.

Processing order? You mean how the nodes are ordered? Assuming you are using a DOM-based parser, I would never rely on that; I would check for the existence of the specific nodes instead.
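
A small sketch of checking for nodes by name rather than by position, written in Python with xml.etree.ElementTree (a tree API comparable to, but not the same as, the .NET DOM classes). The function name classify and the shortened sample elements are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def classify(component):
    """Decide what a component is from which children exist, not where they sit."""
    if component.find("imageurl") is not None:
        return "image"
    if component.find("text") is not None:
        return "text"
    return "unknown"


# The first two components hold the same children in a different order;
# lookup by name gives the same answer for both.
a = ET.fromstring("<component><width>185</width><imageurl>back.jpg</imageurl></component>")
b = ET.fromstring("<component><imageurl>back.jpg</imageurl><width>185</width></component>")
c = ET.fromstring("<component><width>2500</width><text>Here's some text</text></component>")

kinds = [classify(x) for x in (a, b, c)]
print(kinds)
```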

That said, how complicated and logical you make your schema really depends on the intended use of the XML document. It is akin to database schema design and how normalized you want your data to be. Strictly speaking, your "type-specific" nodes just have 0..1 cardinality (hope I am using the term right, I am not a strict academic like SA), so the document will validate against a loose schema.
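
To show what a "loose schema" with 0..1 children amounts to, here is a hand-rolled cardinality check in Python. It is not a DTD or XSD validator; the REQUIRED/OPTIONAL split is an assumption read off the two example components, and cardinality_errors is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed split: children present in both examples must appear exactly once,
# type-specific children may appear zero or one time (0..1).
REQUIRED = {"toppos", "leftpos", "width", "height", "fontsize", "rotation"}
OPTIONAL = {"imageurl", "text", "font", "fontcolor", "textalign"}


def cardinality_errors(component):
    """Return a list of human-readable cardinality violations (empty if none)."""
    counts = Counter(child.tag for child in component)
    errors = []
    for tag in sorted(REQUIRED):
        if counts[tag] != 1:
            errors.append("%s: expected exactly 1, found %d" % (tag, counts[tag]))
    for tag in sorted(OPTIONAL):
        if counts[tag] > 1:
            errors.append("%s: expected 0..1, found %d" % (tag, counts[tag]))
    for tag in sorted(set(counts) - REQUIRED - OPTIONAL):
        errors.append("%s: unexpected element" % tag)
    return errors


image = ET.fromstring(
    "<component><toppos>540</toppos><leftpos>1100</leftpos><width>185</width>"
    "<height>387</height><imageurl>http://myimages.com/back.jpg</imageurl>"
    "<fontsize>0</fontsize><rotation>90</rotation></component>"
)
doubled = ET.fromstring(
    "<component><toppos>540</toppos><leftpos>1100</leftpos><width>185</width>"
    "<height>387</height><imageurl>a.jpg</imageurl><imageurl>b.jpg</imageurl>"
    "<fontsize>0</fontsize><rotation>90</rotation></component>"
)

print(cardinality_errors(image))
print(cardinality_errors(doubled))
```

The image component from the first post passes, while a component with two <imageurl> children is flagged.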

Guessing at your intent, a more "technical" design may look like this:
Code:
<component>
  <toppos>540</toppos>
  <leftpos>1100</leftpos>
  <width>185</width>
  <height>387</height>
  <image>
    <imageurl>http://myimages.com/back.jpg</imageurl>
  </image>
  <fontsize>0</fontsize>
  <rotation>90</rotation>
</component>
<component>
  <toppos>440</toppos>
  <leftpos>875</leftpos>
  <width>2500</width>
  <height>575</height>
  <text>
    <text>Here's some text</text>
    <font>VNI-Helve.ttf</font>
    <fontcolor>000000</fontcolor>
    <textalign>center</textalign>
  </text>
  <fontsize>36</fontsize>
  <rotation>90</rotation>
</component>

You get one more level to deal with, and I don't know the consequences when you pass this to a DataSet. AFAIK the more sophisticated your object relationships are, the tighter your XSD needs to be, so it adds complexity when you bind your stuff to the DataSet. I am guessing this is some quick code to generate ASP.NET pages? I think just doing some straightforward (though less elegant) DOM parsing will be the ticket. .NET DOM-based XML parsing and the XPath stuff are really easy to use. A lot of my day job is still dealing with the COM-based msxml3, so yes, I can tell. 8)
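
For a feel of what path-based parsing of the nested layout looks like, here is a sketch in Python rather than .NET. It uses the limited XPath subset built into xml.etree.ElementTree; the <page> root is an assumption, and the sample is trimmed from the nested design above.

```python
import xml.etree.ElementTree as ET

# Nested layout from the post, trimmed, with an assumed <page> root element.
XML = """
<page>
  <component>
    <toppos>540</toppos>
    <image>
      <imageurl>http://myimages.com/back.jpg</imageurl>
    </image>
  </component>
  <component>
    <toppos>440</toppos>
    <text>
      <text>Here's some text</text>
      <font>VNI-Helve.ttf</font>
    </text>
  </component>
</page>
"""

root = ET.fromstring(XML)

# "component[image]" selects components that have an <image> child;
# "image/imageurl" then walks one level down inside each match.
image_urls = [c.findtext("image/imageurl") for c in root.findall("component[image]")]
text_fonts = [c.findtext("text/font") for c in root.findall("component[text]")]

print(image_urls)
print(text_fonts)
```

The extra nesting level costs one more path step per lookup, but each query states which kind of component it expects.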
Flying Fox

