XML Aggregator Types Explained

Posted on March 22, 2016

XML plays a key role in today's data transport because it can be customized to represent proprietary business processes and to accommodate business data. When a business publishes an XML structure and its data, it is usually expected to provide a schema definition that identifies and explains the purpose of each tag within the XML and describes how the XML can be parsed to obtain the required information. XSD (XML Schema Definition) and DTD (Document Type Definition) are the most commonly used languages for defining the tags (elements) and their data types. Even though a schema provides the definitions for an XML document, the schema alone cannot provide all the information required to analyze and use that document.

For instance:

<Root>
   <Customers>
      <Customer CustomerID="GREAL">
         <CompanyName>Great Lakes Food Market</CompanyName>
         <ContactName>Howard Snyder</ContactName>
         <ContactTitle>Marketing Manager</ContactTitle>
         <Phone>(503) 555-7555</Phone>
         <FullAddress>
            <Address>2732 Baker Blvd.</Address>
            <City>Eugene</City>
            <Region>OR</Region>
            <PostalCode>97403</PostalCode>
            <Country>USA</Country>
         </FullAddress>
      </Customer>
   </Customers>
   <Orders>
      <Order OrderNumber="99503">
         <CustomerID>GREAL</CustomerID>
         <EmployeeID>6</EmployeeID>
         <OrderDate>1997-05-06T00:00:00</OrderDate>
         <RequiredDate>1997-05-20T00:00:00</RequiredDate>
         <Item PartNumber="872-AA">
            <ProductName>Lawnmower</ProductName>
            <Quantity>1</Quantity>
            <USPrice>148.95</USPrice>
            <Comment>Confirm this is electric</Comment>
         </Item>
         <Item PartNumber="926-AA">
            <ProductName>Baby Monitor</ProductName>
            <Quantity>2</Quantity>
            <USPrice>39.98</USPrice>
            <ShipDate>1999-05-21</ShipDate>
         </Item>
      </Order>
   </Orders>
</Root>

 

This sample XML contains information about customers and their orders. A business that intends to use this data must first find a mechanism for carrying the attributes and "KEY" values of a parent element into its child elements. In this example, to construct a data set of "Order Items", it must first extract, map, and link the "OrderNumber" attribute as the primary index for "Order Items". Take a closer look at the relevant part of the sample:

<Order OrderNumber="99503">
   <CustomerID>GREAL</CustomerID>
   <EmployeeID>6</EmployeeID>
   <OrderDate>1997-05-06T00:00:00</OrderDate>
   <RequiredDate>1997-05-20T00:00:00</RequiredDate>
   <Item PartNumber="872-AA">
      <ProductName>Lawnmower</ProductName>
      <Quantity>1</Quantity>
      <USPrice>148.95</USPrice>
      <Comment>Confirm this is electric</Comment>
   </Item>
   <Item PartNumber="926-AA">
      <ProductName>Baby Monitor</ProductName>
      <Quantity>2</Quantity>
      <USPrice>39.98</USPrice>
      <ShipDate>1999-05-21</ShipDate>
   </Item>
</Order>
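
The "Order Items" data set described above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree module. This is only an illustration, not how any particular tool does it: the function name order_items and the row layout are assumptions made for this example, and the XML is an abridged copy of the sample.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample <Order> element.
ORDER_XML = """
<Order OrderNumber="99503">
  <CustomerID>GREAL</CustomerID>
  <OrderDate>1997-05-06T00:00:00</OrderDate>
  <Item PartNumber="872-AA">
    <ProductName>Lawnmower</ProductName>
    <Quantity>1</Quantity>
    <USPrice>148.95</USPrice>
  </Item>
  <Item PartNumber="926-AA">
    <ProductName>Baby Monitor</ProductName>
    <Quantity>2</Quantity>
    <USPrice>39.98</USPrice>
  </Item>
</Order>
"""

def order_items(order):
    """Build one row per <Item>, keyed by the parent order's OrderNumber.

    Hypothetical helper: the name and row layout are illustrative only.
    """
    rows = []
    for item in order.findall("Item"):  # direct children of <Order> only
        row = {
            "OrderNumber": order.get("OrderNumber"),  # key taken from the parent
            "PartNumber": item.get("PartNumber"),
        }
        # Copy each text-bearing child of <Item> into the row.
        for child in item:
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows

rows = order_items(ET.fromstring(ORDER_XML))
for row in rows:
    print(row)
```

Because findall("Item") looks only at direct children of the order, this approach depends on "Item" sitting immediately under "Order".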

There are multiple ways to use this data. One option is to define an "Order Items" data set that takes the "OrderNumber" attribute and the "OrderDate" element from the "Order" element, the parent of each "Item" element. Another is to define a single data set with redundant rows that combine the order and item data for each "Item" element. The logic of applying "OrderNumber" from the immediate parent element to the "Order Items" data set can become obsolete if the format of the XML changes to the one below.

 

<Order OrderNumber="99503">
   <CustomerID>GREAL</CustomerID>
   <EmployeeID>6</EmployeeID>
   <OrderDate>1997-05-06T00:00:00</OrderDate>
   <RequiredDate>1997-05-20T00:00:00</RequiredDate>
   <ITEMS>
      <Item PartNumber="872-AA">
         <ProductName>Lawnmower</ProductName>
         <Quantity>1</Quantity>
         <USPrice>148.95</USPrice>
         <Comment>Confirm this is electric</Comment>
      </Item>
   </ITEMS>
</Order>
   

In this case, the immediate parent of "Item" is "ITEMS", which provides no key to define the "Order Items" data set or to maintain its relationship with the "Orders" data set.
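
One possible way to cope with a wrapper element such as "ITEMS" is to search upward through the ancestors until one of them carries the key attribute. The Python sketch below illustrates that idea under stated assumptions: nearest_ancestor_attr is a hypothetical helper written for this example, and it is not a description of how DIAL resolves keys. ElementTree does not store parent pointers, so the sketch first builds a child-to-parent map.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the changed format, with <Item> wrapped in <ITEMS>.
NESTED_XML = """
<Order OrderNumber="99503">
  <OrderDate>1997-05-06T00:00:00</OrderDate>
  <ITEMS>
    <Item PartNumber="872-AA">
      <ProductName>Lawnmower</ProductName>
      <Quantity>1</Quantity>
    </Item>
  </ITEMS>
</Order>
"""

def nearest_ancestor_attr(elem, attr, parent_of):
    """Walk upward from elem and return attr from the first ancestor that has it.

    Hypothetical helper for illustration; returns None if no ancestor has attr.
    """
    node = parent_of.get(elem)
    while node is not None:
        if attr in node.attrib:
            return node.get(attr)
        node = parent_of.get(node)
    return None

root = ET.fromstring(NESTED_XML)
# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

nested_rows = []
for item in root.iter("Item"):  # finds <Item> at any depth
    nested_rows.append({
        "OrderNumber": nearest_ancestor_attr(item, "OrderNumber", parent_of),
        "PartNumber": item.get("PartNumber"),
        "ProductName": item.findtext("ProductName"),
    })
print(nested_rows)
```

The same lookup also works for the original format, where "Order" is the immediate parent of "Item", so the extraction logic does not have to change when the wrapper is introduced.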

The approach used to "AGGREGATE" an XML is the most critical piece of information required to analyze, extract, and use it. The logic behind every XML aggregation involves both the owner and the consumer of the XML. In the domain of data integration, one should have a clear understanding of how an XML is formed and how its elements relate to each other, as well as in-depth knowledge of the data sets that will serve as the repository when the XML is mapped into another system.


DIAL provides several "XML Aggregation" mechanisms that help users performing data manipulation and integration to easily identify the aggregation logic that should be used to achieve an optimum XML-based integration.

Sample XML considered

 

<DIAL>
   <PurchaseOrder TYPE="CONTRACT">
      <PO_ACCOUNT>OPPILLA</PO_ACCOUNT>
      <Supplier>
         <SupplierCode>9999</SupplierCode>
      </Supplier>
      <OrderHeader>
         <OrderType>ST</OrderType>
         <OrderNumber>00099999</OrderNumber>
         <OrderDate>2004-04-26T00:00:00</OrderDate>
         <DeliveryLocation>MAIN WAREHOUSE</DeliveryLocation>
         <OrderTerms>NETT 45</OrderTerms>
      </OrderHeader>
      <OrderLines>
         <Line>
            <LineNumber>0001</LineNumber>
            <ItemNumber>567890</ItemNumber>
            <Quantity>10</Quantity>
            <UnitPrice>25000</UnitPrice>
         </Line>
         <Line>
            <LineNumber>0002</LineNumber>
            <ItemNumber>232345</ItemNumber>
            <Quantity>5</Quantity>
            <UnitPrice>30000</UnitPrice>
         </Line>
      </OrderLines>
      <OrderMisc>
         <FreightTotal>0.00</FreightTotal>
         <OrderTotal>400000</OrderTotal>
      </OrderMisc>
   </PurchaseOrder>
</DIAL>

DIAL XML Aggregator 1:
    DIAL XML Aggregator Type 1 extracts every tag and references it within its parent. A data container is an element (tag) that has child elements rather than raw text. All of its child elements that contain raw text are referenced within that data container, and the attributes of its parent container are referenced within it as well.
    In the sample XML, five data containers are formed: "PurchaseOrder", "Supplier", "OrderHeader", "Line", and "OrderMisc". Each data container holds the element (tag) name and the raw XML data of its child elements; for example, "OrderHeader" contains "OrderType", "OrderNumber", "OrderDate", "DeliveryLocation", and "OrderTerms". Each data container also carries the attribute(s) of its parent container. In the example, "Supplier", "OrderHeader", and "OrderMisc" are all child elements of the parent container "PurchaseOrder", so the attribute "TYPE", which belongs to "PurchaseOrder", is referenced within each of them. Note that the parent of the "Line" container is "OrderLines", which has no attributes, so unlike the other data containers, "Line" carries no attributes from its parent; in this instance "TYPE" is not part of the "Line" container.
 

Aggregation Rules

  1. A data container is formed for each XML element (tag) that has child elements instead of raw text.
  2. All child elements that contain raw text are referenced within their parent container.
  3. The parent container's attributes are extracted and referenced within each of its child containers.
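
The three rules above can be approximated in Python as a rough sketch. It is one reading of the rules, not DIAL's actual implementation: the function aggregate_type1 and the container layout are hypothetical, the XML is an abridged copy of the sample, and the sketch assumes that an element only becomes a container when at least one of its children holds raw text (which is what yields the five containers named in the text, with "Line" occurring twice).

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample purchase order.
DIAL_XML = """<DIAL>
  <PurchaseOrder TYPE="CONTRACT">
    <PO_ACCOUNT>OPPILLA</PO_ACCOUNT>
    <Supplier><SupplierCode>9999</SupplierCode></Supplier>
    <OrderHeader>
      <OrderType>ST</OrderType>
      <OrderNumber>00099999</OrderNumber>
    </OrderHeader>
    <OrderLines>
      <Line><LineNumber>0001</LineNumber><Quantity>10</Quantity></Line>
      <Line><LineNumber>0002</LineNumber><Quantity>5</Quantity></Line>
    </OrderLines>
    <OrderMisc><OrderTotal>400000</OrderTotal></OrderMisc>
  </PurchaseOrder>
</DIAL>"""

def aggregate_type1(root):
    """Hypothetical sketch of the Type 1 rules; not DIAL's own code."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    containers = []
    for elem in root.iter():
        # Rules 1 and 2: gather the children that hold raw text (no sub-elements).
        fields = {c.tag: (c.text or "").strip() for c in elem if len(c) == 0}
        if not fields:
            continue  # assumption: no raw-text children means no container
        parent = parent_of.get(elem)
        # Rule 3: reference the immediate parent's attributes, if it has any.
        parent_attrs = dict(parent.attrib) if parent is not None else {}
        containers.append({
            "name": elem.tag,
            "fields": fields,
            "parent_attributes": parent_attrs,
        })
    return containers

containers = aggregate_type1(ET.fromstring(DIAL_XML))
for container in containers:
    print(container["name"], container["fields"], container["parent_attributes"])
```

In this sketch "OrderHeader" picks up TYPE from "PurchaseOrder", while each "Line" container has no parent attributes because "OrderLines" defines none.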
