Php domdocument get node html




Solution:

$dom

= new DomDocument();
$dom->prevservWhiteSpace = false;

if (!@

$dom->load("example.xml")) {
    echo
"example.xml doesn't exist!\n";
    return;
}
$imageList = $dom->getElementsByTagName('file');
$imageCnt  = $imageList->length;

for (

$idx = 0; $idx < $imageCnt; $idx++) {
    print
$imageList->item($idx)->nodeValue . "\n";
}
?>

The above PHP code could easily be turned into a function that returns an array of image filenames, an integer value relative to the number of images found, etc.

Hope this is helpful.

jason at shaped dot ca

14 years ago

in response to tildy at pr dot hu

my preferred way is (in example to gather country data from an iso 3166 xml flie)

$countries = new DOMDocument();
$countries->load("./lib/iso_3166.xml");

$countriesList = $countries->getElementsByTagName("ISO_3166-1_Entry");

foreach($countriesList as $country) {
    $values = $country->getElementsByTagName("*");
    foreach($values as $node) {
      echo $node->nodeName."=".$node->nodeValue;
    }
}

Description of the current situation:

I have a folder full of pages (pages-folder), each page inside that folder has (among other things) a div with id="short-info".
I have a code that pulls all the

...
from that folder and displays the text inside it by using textContent (which is for this purpose the same as nodeValue)

The code that loads the divs:

loadHTMLFile($filenamein);
    $xpath = new DOMXpath($doc);
    $elements = $xpath->query("*//div[@id='short-info']");

        foreach ($elements as $element) {
            $nodes = $element->childNodes;
            foreach ($nodes as $node) {
                echo $node->textContent;
            }
        }
}
?>

Now the problem is that if the page I am loading has a child, like an image:

Php domdocument get node html
Hello world
, the output will only be Hello world rather than the image and then Hello world.

Question:

How do I make the code display the full html inside the div id="short-info" including for instance that image rather than just the text?

(PHP 5, PHP 7, PHP 8)

DOMDocument::getElementsByTagNameSearches for all elements with given local tag name

Description

public DOMDocument::getElementsByTagName(string $qualifiedName): DOMNodeList

Parameters

qualifiedName

The local name (without namespace) of the tag to match on. The special value * matches all tags.

Return Values

A new DOMNodeList object containing all the matched elements.

Examples

Example #1 Basic Usage Example

$xml = <<< XML


 Patterns of Enterprise Application Architecture
 Design Patterns: Elements of Reusable Software Design
 Clean Code

XML;$dom = new DOMDocument;
$dom->loadXML($xml);
$books $dom->getElementsByTagName('book');
foreach (
$books as $book) {
    echo 
$book->nodeValuePHP_EOL;
}
?>

The above example will output:

Patterns of Enterprise Application Architecture
Design Patterns: Elements of Reusable Software Design
Clean Code

See Also

  • DOMDocument::getElementsByTagNameNS() - Searches for all elements with given tag name in specified namespace

James L

14 years ago

Return if there are no matches is an empty DOMNodeList. Check using length property, e.g.:

$nodes=$domDocument->getElementsByTagName('book') ;
if (
$nodes->length==0) {
  
// no results
}
?>

Philip N

12 years ago

Note that when using getElementsByTagName that it is a dynamic list. Thus if you have code which adjusts the DOM structure it will change the results of the getElementsByTagName results list.

The following code iterates through a complete set of results and changes them all to a new tag:

$nodes = $xml->getElementsByTagName ("oldtag");$nodeListLength = $nodes->length; // this value will also change
for ($i = 0; $i < $nodeListLength; $i ++)
{
   
$node = $nodes->item(0);// some code to change the tag name from "oldtag" to something else
    // e.g. encrypting a tag element
}
?>

Since the list is dynamically updating, $nodes->item(0) is the next "unchanged" tag.

gurmukh24 at gmail dot com

13 years ago

Following Example is of multiple attributes and multiple child nodes. this is being used to make joomla plugin for bulk upload of articles. Gurmukh Singh Bhatti

$xml =<<


 
  
  

any html code here


   my name is so so
   

   value2
   value3
   value4
 

   
   Test value
 



 
   value1
   value2
 

   
    value4
  



EOT;$dom = new DomDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML($xml);
$params = $dom->getElementsByTagName('section'); // Find Sections
$k=0;
foreach (
$params as $param) //go to each section 1 by 1
{
         echo
"Section Attribute :-> ".$params->item($k)->getAttribute('name')."
"
;   //get section attribute           
        
$params2 = $params->item($k)->getElementsByTagName('category'); //digg categories with in Section
     
$i=0; // values is used to iterate categories 
       
foreach ($params2 as $p) {
           echo
"  - Category Attribute Name :-> ".$params2->item($i)->getAttribute('name')."
"
; //get Category attributes
           
$params3 = $params2->item($i)->getElementsByTagName('arti'); //dig Arti into Categories
                
$j=0;//values used to interate Arti
                    
foreach ($params3 as $p2)
                   {
                    echo
"   - Article Attribute Name : ".$params3->item($j)->getAttribute('name').""; //get arti atributes
echo "   Value : ".$params3->item($j)->nodeValue."
"
; //get Node value ;
                             
$j++;  
                   }             
        
$i++;
      }
$k++;   

          }

?>

output :
Section Attribute :-> Section1
  - Category Attribute Name :-> google
            - Article Attribute Name : article1   Value : any html code heremy name is so so
            - Article Attribute Name : article2   Value : value2
            - Article Attribute Name : article3   Value : value3
            - Article Attribute Name : article4   Value : value4
  - Category Attribute Name :-> yahoo
            - Article Attribute Name : articleSection2   Value : Test value
Section Attribute :-> Section2
  - Category Attribute Name :-> msn
            - Article Attribute Name : article2   Value : value1
            - Article Attribute Name : article3   Value : value2
  - Category Attribute Name :-> webcare
            - Article Attribute Name : param3   Value : value4

calvin at g mail

11 years ago

My first post!
And this is how I get elements by attribute and its value.
For an example, if I want to grab all DIV tags with class name 'className', then...

$some_link = 'some website';
$tagName = 'div';
$attrName = 'class';
$attrValue = 'className';$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
@
$dom->loadHTMLFile($some_link);$html = getTags( $dom, $tagName, $attrName, $attrValue );
echo
$html;

function

getTags( $dom, $tagName, $attrName, $attrValue ){
   
$html = '';
   
$domxpath = new DOMXPath($dom);
   
$newDom = new DOMDocument;
   
$newDom->formatOutput = true;$filtered = $domxpath->query("//$tagName" . '[@' . $attrName . "='$attrValue']");
   
// $filtered =  $domxpath->query('//div[@class="className"]');
    // '//' when you don't know 'absolute' path

    // since above returns DomNodeList Object
    // I use following routine to convert it to string(html); copied it from someone's post in this site. Thank you.

$i = 0;
    while(
$myItem = $filtered->item($i++) ){
       
$node = $newDom->importNode( $myItem, true );    // import node
       
$newDom->appendChild($node);                    // append node
   
}
   
$html = $newDom->saveHTML();
    return
$html;
}
?>

Please, improve it, and share it.

gem at rellim dot com

17 years ago

Here is an example of getElementsByTagName():

$xml =<<

 


   value1
   value2
 

 

   value3
 


EOT;$dom = new DomDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadXML($xml);
$params = $dom->getElementsByTagName('param');

foreach (

$params as $param) {
       echo
$param -> getAttribute('name').'
'
;
}
?>

Expected result:
--------------
param1
param2
param3

yaakov dot moddel at gmail dot com

10 years ago

This is a very simplistic way to traverse xml nodes and childnodes using the DOMDocument class

$xml ='

   
        child 1
        child 2
        child 3
       
            Grandchild 1
            Grandchild 2
            Grandchild 3
       

   

   
        child 4
        child 5
        child 6
       
            Grandchild 4
            Grandchild 5
            Grandchild 6
       

   

'
;
     
$doc = new DOMDocument();
     
$doc->preserveWhiteSpace = false;
     
$doc->loadXML($xml);
     
$i=0;
            while(
is_object($finance = $doc->getElementsByTagName("Parent")->item($i)))
            {
                     foreach(
$finance->childNodes as $nodename)
                     {
                             if(
$nodename->nodeName=='subParent')
                                {
                                     foreach(
$nodename->childNodes as $subNodes)
                                     {
                                     echo
$subNodes->nodeName." - ".$subNodes->nodeValue."
"
;
                                     }
                                }
                             else
                                {
                                echo
$nodename->nodeName." - ".$nodename->nodeValue."
"
;
                                }
                     }
     
$i++;
            }
?>

simon at onepointltd dot com

9 years ago

Here is a function to convert the populated tables in and HTML document into an array.

//Create an array containing all the populated tables in an HTML page
//Returned array: tables_to_array[table number][row number][column number]
function tables_to_array ($url) {
 
$htmlDocDom = new DOMDocument();

  @

$htmlDocDom->loadHTMLFile($url);
 
$htmlDocDom->preserveWhiteSpace = false;
 
$tableCounter = 0;
 
$htmlDocTableArray = array();
 
$htmlDocTables = $htmlDocDom->getElementsByTagName('table');
  foreach (
$htmlDocTables as $htmlDocTable) {
   
$htmlDocTableArray[$tableCounter] = array();
   
$htmlDocRows= $htmlDocTable->getElementsByTagName('tr');
   
$htmlDocRowCount = 0;
   
$htmlDocTableArray[$tableCounter] = array();
    foreach (
$htmlDocRows as $htmlDocRow) {
      if (
strlen($htmlDocRow->nodeValue) > 1)
      {
       
$htmlDocColCount = 0;
       
$htmlDocTableArray[$tableCounter][$htmlDocRowCount] = array();
       
$htmlDocCols = $htmlDocRow->getElementsByTagName('td');
        foreach (
$htmlDocCols as $htmlDocCol) {
         
$htmlDocTableArray[$tableCounter][$htmlDocRowCount][] = $htmlDocCol->nodeValue;
         
$htmlDocColCount++;
        }
       
$htmlDocRowCount++;
      }
    }
    if (
$htmlDocRowCount > 1) $tableCounter++;
  }
  return(
$htmlDocTableArray);
}
?>

Finding values of a node

15 years ago

I don't know if this is that obvious but it was not for me, so in addition to gem at rellim dot com's posting:
adding

echo $param -> nodeValue.'
'
;
?>

to the loop will output
value1
value2
value3

rsvvuuren at hotmail dot com

8 years ago

I was in need of $dom->getElementsByTagName that would hist magic within a contextNode.

Why i needed getElementsByTagName instead of just simple using an xPath->query is because while looping through the returned nodelist, more nodes with the tagName i was looking for where created.

When using getElementsByTagName, the new nodes are "added" to the nodelist i already am looping through.

When using an xpath query, you wil only loop through the original nodelist, the newly created elements wil not appear in that nodelist.

I was already using an extended class on domDocument, so it was simple to create an kind of getElementsByTagName that would accept an contextNode.

class SmartDocument extends DOMDocument {
    private
$localDom;
    public
$xpath;
    private
$serialize = array('localDom');

        private

$elemName;
    private
$elemCounter;/**
     * Constructor
     */
   
function __construct() {
       
parent::__construct ( '1.0', 'UTF-8' );
       
$this->preserveWhiteSpace = false;
       
$this->recover = TRUE;
       
$this->xpath = new DOMXpath ( $this );
    }
/**
     * GetElementsByTagname within an contextNode
     *
     * @param string $name
     * @param DomNode $contextNode
     * @return DOMNode|NULL
     */
   
public function getElementsByTagNameContext($name, $contextNode) {

                if(

$this->elemName!=$name) {
           
$this->elemCounter     = 0;
           
$this->elemName        =$name;
        }
$this->elemLength      = $this->xpath->evaluate('count(.//*[name()="'.$this->elemName.'"])', $contextNode);
        while(
$this->elemCounter < $this->elemLength) {
           
$this->elemCounter++;
           
$nl = $this->xpath->query('.//*[name()="'.$this->elemName.'"]['.$this->elemCounter.']', $contextNode);
            if(
$nl->length == 1) {
                return
$nl->item(0);
            }   
        }
$this->elemLength      = null;
       
$this->elemCounter     = null;
       
$this->elemName        = null;       
        return
null;
    }
}
?>

Usage:

$doc

= new SmartDocument();
$doc->load('book.xml');$nl = $doc->query('//books');
foreach(
$nl as $node) {
  while(
$book = $doc->getElementsByTagNameContext('book', $node)) {
  
//When you now create new nodes within this loop as child or following-sibling of this node
  // They show up within this loop
 
}

}

?>

metron at underhive-planet dot com

11 years ago

My first try to get a stable solution with this function, failed with this Exception:

"Fatal error: Call to undefined method DOMNodeList::getElementsByTagName()"
This is the xml snipplet:



 
    ....
 


So the php code to climb along the elements is:

$src

= new DOMDocument('1.0', 'utf-8');
$src->formatOutput = true;
$src->preserveWhiteSpace = false;//Loading extern file
$src->load('../xml/Item_component.xml');//Check each child of first indexed branch node of:
//First get element after root element:

//1st level

$component = $src->getElementsByTagName('component')->item(0);//2nd level, get next element after component, here it fails!!
$properties = $component->getElementsByTagName('properties')->item(0);
...
?>

I realised, that there is a different when using different libxml2 versions on Apache2. This code will fail with libxml2 version 2.6.23 and PHP version 5.2.6
--
->It works fine with libxml2 version 2.6.32 and PHP version 5.2.6-3ubuntu4.6
->...and finally it also works with libxml2 2.7.7 and PHP >= 5.3

So if you bored to search for solutions with DOM like I did, please ensure that your www environment has the correct libxml2 / PHP Version installed on your apache2 server.

Mateusz K

2 years ago

If you want to move all nodes to another tag you can do it like that
Eg: replace

with nodes to

with the same nodes.

function replaceDomElementTag(DOMDocument $dom, DOMElement $node, string $tagName)
{
        $newElement = $dom->createElement($tagName);
        while ($node->childNodes->length)
            $newElement->appendChild($node->childNodes[0]);

        $node->parentNode->replaceChild($newElement, $node);
}

$dom = new DOMDocument();
replaceDomElementTag($dom, $divElementToReplace, 'p');

Marco Maranao

12 years ago

The following takes a list of news items from an XML file (or an RSS feed), assigning it to an array first for a name value pair and then generating an HTML list.

$xml

=<<

   
        News 1
        04/2/2010 08:00 EST
        http://news.example.com/news.pdf
   

   
        News 2
        04/25/2010 08:00 EST
        http://news.example.com/news.pdf
   

   
        News 3
        04/27/2010 08:00 EST
        http://news.example.com/news.pdf
   


EOT; $doc = new DOMDocument();

if (

$doc->loadXML($xml)) {
   
$items = $doc->getElementsByTagName('item');
   
$headlines = array();

        foreach(

$items as $item) {
       
$headline = array();

                if(

$item->childNodes->length) {
            foreach(
$item->childNodes as $i) {
               
$headline[$i->nodeName] = $i->nodeValue;
            }
        }
$headlines[] = $headline;
    }

        if(!empty(

$headlines)) {
       
$hc = 0;

                echo

'';
    }
}
?>

james

12 years ago

Problem:
You have an XML document that contains filename references to, say, images. Each filename reference is defined by filename.ext tag. You'd like to perform perform additional validation, say, after running the XML document through XSD validation. The additional validation can be any of your choice, in this example, it would be ideal to convert the PHP code to a function. The function would then determine if the images exist and return either an integer value or a boolean.




example.png



example2.png

The above image is an example