We’re helping a number of clients right now with Marketo migrations. As large companies utilize enterprise solutions like this, the platform becomes a spider web that weaves itself into processes and platforms over the years… until the point that companies aren’t even aware of every touchpoint.
With an enterprise marketing automation platform like Marketo, forms are the entry point of data throughout sites and landing pages. Companies often have thousands of pages and hundreds of forms throughout their sites that need to be identified for updating.
A great tool for this is Screaming Frog’s SEO Spider… perhaps the most popular platform in the SEO market for crawling, auditing, and extracting data from a site. The platform is feature-rich and offers hundreds of options for almost every task you require. The features extend far beyond optimization for search, though, with one incredibly helpful feature for extracting data from your site as it’s being crawled.
Screaming Frog SEO Spider: Crawl And Extract
A key feature of Screaming Frog SEO Spider is that you can perform custom extractions based on Regex, XPath, or CSSPath specifics. This comes in extremely handy as we wish to crawl the client’s sites and audit and capture the MunchkinID and FormId values from pages.
With the tool, open Configuration > Custom > Extraction to identify the elements you wish to extract.
The extraction screen allows for virtually unlimited data collection.
Regex, XPath, and CSSPath Extraction
For the MunchkinID, the identifier is located within the form script that’s within the page:
<script type='text/javascript' id='marketo-fat-js-extra'>
/* <![CDATA[ */
var marketoFat = {
"id": "123-ABC-456",
"prepopulate": "",
"ajaxurl": "https://yoursite.com/wp-admin/admin-ajax.php",
"popout": {
"enabled": false
}
};
/* ]]> */
</script>
We then apply a Regex rule to capture the id from within the script tag that’s inserted in the page:
Regex: ["']id["']: *["'](.*?)["']
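To sanity-check a rule like this before a long crawl, it can help to run the same pattern against a saved copy of a page. The following is a minimal Python sketch under that assumption; the sample markup, the constant names, and the `extract_munchkin_id` helper are hypothetical and only illustrate how the capture group behaves.

```python
import re

# Sample markup modeled on the script block above; the ID is a placeholder.
PAGE_HTML = """
<script type='text/javascript' id='marketo-fat-js-extra'>
var marketoFat = {
"id": "123-ABC-456",
"prepopulate": "",
"ajaxurl": "https://yoursite.com/wp-admin/admin-ajax.php"
};
</script>
"""

# Same pattern as the extraction rule: a quoted id key, a colon,
# optional spaces, then the quoted value captured lazily.
MUNCHKIN_RE = re.compile(r"""["']id["']: *["'](.*?)["']""")


def extract_munchkin_id(html):
    """Return the first captured id value, or None when nothing matches."""
    match = MUNCHKIN_RE.search(html)
    return match.group(1) if match else None


print(extract_munchkin_id(PAGE_HTML))
```

Because the pattern requires a quote directly before `id`, the `id='marketo-fat-js-extra'` attribute on the script tag itself is not captured; only the quoted key inside the object is.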
For the Form ID, the data is in an input tag within the Marketo form:
<input type="hidden" name="formid" class="mktoField mktoFieldDescriptor" value="1234">
We apply an XPath rule to capture the id from within the form that’s inserted in the page. The XPath query looks for a form with an input with a name of formid, then the extraction saves the value:
XPath: //form/input[@name="formid"]/@value
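The same selection can be approximated outside the crawler with Python's standard library. This is a rough sketch, not the crawler's own XPath engine: `xml.etree.ElementTree` supports only a subset of XPath and has no attribute step, so the value is read after the elements are selected. The sample markup and the `extract_form_ids` helper are hypothetical, and the input is self-closed so the XML parser accepts it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample. Real pages usually need an
# HTML-aware parser, because HTML inputs are not self-closed.
SAMPLE = """
<html><body>
<form>
<input type="hidden" name="formid" class="mktoField mktoFieldDescriptor" value="1234"/>
</form>
</body></html>
"""


def extract_form_ids(markup):
    """Return the value of every formid input that is a direct child of a form."""
    root = ET.fromstring(markup.strip())
    # Select the matching input elements first, then read the attribute,
    # since ElementTree cannot select attribute values directly.
    inputs = root.findall(".//form/input[@name='formid']")
    return [node.get("value") for node in inputs]


print(extract_form_ids(SAMPLE))
```

Note that the rule only matches inputs that are direct children of the form. If a page wraps the input in another element, a descendant step such as `//form//input` would be needed instead.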
Extract Inline Style Tags
We’re helping a client right now clean up a site where they used inline styles in the Elementor plugin to customize virtually every element within a page. To identify where inline styles were used, we scraped the site with numerous RegEx rules for custom extraction:
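- Span Tag Inline Style:
<span\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"
- Anchor Tag Inline Style:
<a\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"
- Div Tag Inline Style:
<div\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"
- Heading Tag Inline Style:
<h[1-6]\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"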
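As a quick check of rules like these, the sketch below runs one pattern per tag against a small HTML fragment. The patterns are written with explicit `\s` whitespace classes, and the heading rule assumes `h1` through `h6`. The sample fragment, the rule names, and the `find_inline_styles` helper are hypothetical.

```python
import re

# One rule per tag; each captures the contents of a double-quoted style attribute.
INLINE_STYLE_RULES = {
    "span": r'<span\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"',
    "a": r'<a\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"',
    "div": r'<div\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"',
    "heading": r'<h[1-6]\s+(?:[^>]*?\s+)?style\s*=\s*"([^"]*)"',
}

# Hypothetical fragment with one styled element per rule.
SAMPLE = (
    '<div class="wrap" style="margin:0">'
    '<h2 style="color:red">Title</h2>'
    '<span style="font-weight:700">Bold</span>'
    '<a href="/contact" style="text-decoration:none">Contact</a>'
    '<p>No inline style here</p>'
    '</div>'
)


def find_inline_styles(html):
    """Map each rule name to the list of style values it captures."""
    return {
        name: re.findall(pattern, html, flags=re.IGNORECASE)
        for name, pattern in INLINE_STYLE_RULES.items()
    }


for name, styles in find_inline_styles(SAMPLE).items():
    print(name, styles)
```

The `[^>]*?` segment keeps each match inside a single opening tag, so a `div` without a style attribute cannot pick up the style of an element nested inside it. Attributes quoted with single quotes are not matched by these rules.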
Exclude Subdomains In Your Crawl
At Martech Zone, we serve the site in multiple languages at different subdomains. Crawling these translations isn’t necessary since all the assets and information are based on the core site. Because of this, we enabled the Exclude List Configuration and added the following rule:
.*\.martech\.zone
You can also use this to skip crawling unnecessary paths like tags by adding:
martech\.zone/tag/.*
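If you want to reason about rules like these outside the tool, a small script can approximate the check. This is a sketch under two assumptions: the dots in the rules are escaped so they match literal dots, and a rule counts as a hit when it matches anywhere in the URL. The sample URLs and the `is_excluded` helper are hypothetical.

```python
import re

# The two exclude rules from the text, with the literal dots escaped.
EXCLUDE_RULES = [
    r".*\.martech\.zone",
    r"martech\.zone/tag/.*",
]


def is_excluded(url, rules=EXCLUDE_RULES):
    """Return True when any rule matches somewhere in the URL.

    Assumption: partial matching via re.search. Switch to re.fullmatch
    if your crawler requires a rule to match the whole URL.
    """
    return any(re.search(rule, url) for rule in rules)


# Hypothetical URLs for illustration only.
for url in [
    "https://martech.zone/some-article/",
    "https://fr.martech.zone/some-article/",
    "https://martech.zone/tag/seo/",
]:
    print(url, is_excluded(url))
```

Escaping matters here: with an unescaped dot, the first rule can also match the slash in `//martech.zone` under partial matching and exclude the core site along with the subdomains.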
The platform even has a nice way to test some URLs against the rules to ensure they work properly before you crawl your site.
Screaming Frog SEO Spider JavaScript Rendering
Another great option in Screaming Frog is that you aren’t limited to the HTML in the page; you can render any JavaScript that’s going to insert forms within your site. Within Configuration > Spider, you can go to the Rendering tab and enable this.
This does take a little longer to crawl the site, of course, but you’ll get forms that are rendered client-side by JavaScript as well as forms that are inserted server-side.
While this is a very specific application, it’s an incredibly useful one when you’re working with large sites. You’ll absolutely want to audit where your forms are embedded throughout the site.
Download Screaming Frog SEO Spider
Disclosure: Martech Zone is using its affiliate links in this article.