Hi there.
I want to turn a static website (about 150 pages) into a SilverStripe (SS) site. I installed SS to test the StaticSiteImporter (latest trunk) and it works, but I'm not sure the output is what it should be.
When I check "Preview the content that will be extracted", many pages are listed with their extracted content.
But the content looks like this:
<?xml version="1.0" encoding="utf-8"?>
<!-- ra -->
<!DOCTYPE html ....
... all the HTML of the page ...
} catch(err) {}
//]]>
</script><!-- InstanceEnd -->
</body>
</html>
As I understand it, this feature should only show me the grabbed content, right?
This is what I defined in mysite/_config.php:
StaticImporter::set_url("www.example.com/");
StaticImporter::set_allowed_extensions(array('php', 'html', 'jpg', 'pdf'));
StaticImporter::set_rules(
    array(
        // Default rules for all other URLs
        'conditions' => array(),
        'fields' => array(
            'Title' => array(
                'xpath' => array(
                    '//h1'
                ),
                'exclusive' => 1
            ),
            'Hierarchy' => array(
                'xpath' => array(
                    '//h2[contains(@class, "location")]/a/@href',
                ),
                'exclusive' => 1
            ),
            'Content' => array(
                'xpath' => '//div[contains(@id, "content")]',
                'includeMatchedTag' => 0
            )
        ),
        'exclusive' => 1
    )
);
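I was also thinking I could sanity-check the XPath expressions outside of SilverStripe with plain PHP (DOMDocument/DOMXPath), something like the rough sketch below. The URL is just a placeholder for one of my real pages, and this is only to see whether the expressions match anything at all:

// Quick sanity check, independent of the importer: fetch one source page
// and run the same XPath expressions against it.
$html = file_get_contents('http://www.example.com/some-page.html'); // placeholder URL

$doc = new DOMDocument();
libxml_use_internal_errors(true); // the source pages are not valid XML, so suppress parse warnings
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The same expressions used in the rules above
$expressions = array(
    'Title'     => '//h1',
    'Hierarchy' => '//h2[contains(@class, "location")]/a/@href',
    'Content'   => '//div[contains(@id, "content")]',
);

foreach ($expressions as $field => $expression) {
    $nodes = $xpath->query($expression);
    echo $field . ': ' . $nodes->length . " match(es)\n";
}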
Targeted website: http://bit.ly/b5yrHV
Is my XPath wrong, or is it safe to run the import?