I was recently tasked with updating all of our product descriptions on the MSDN Marketplace site. The requirements were that every product description match our website exactly and that every link include a query string parameter. Manually qualifying and tagging hundreds of HREFs and SRCs every quarter is not my idea of a good time; I'd rather write code! Additionally, the MSDN site has some markup quirks that need to be handled consistently.
Fortunately, our website is based on ASP.NET and uses layouts. This means that by changing the layout page and writing a little code, I was able to reuse the existing content, adjust for the MSDN quirks, and turn the task into an error-free copy-and-paste job.
♻ Recycle Content
We use layout pages on our site to keep all of our content pages consistent. Our layout pages hold the page structure and specify the sections that content pages are required to define. The body of a content page is rendered in the layout page when RenderBody() is called. Basically, RenderBody() renders anything in the content page that is not enclosed in a @section{} block. This makes it very easy for every member of our team to add new content and update existing content without having to worry about forgetting required elements or how a certain page type is structured.
You can read more about layout pages on w3schools.com.
The first thing I did was make a copy of our product layout page. Then, I stripped out all the extra stuff (e.g., the Google Analytics script) that was not needed for the MSDN Marketplace product listings. Next, I added the HTML scaffolding that we decided to use.
At this point, I was ready to process the links and fix the marketplace-specific quirks, but I was not sure how. The content is rendered directly in the layout page, so I was not exactly sure where I could modify it. I began digging more into the RenderPage() function and noticed that it returns a System.Web.WebPages.HelperResult. The documentation for HelperResult says that it is intended only for the framework's internal use. Since only our team would be using the results, I intrepidly ignored the warnings in the documentation.
Instead of rendering the results of RenderBody() directly into the layout, I stored them in a variable. With very little trial and error, I was able to render that variable in the layout with a call to Html.Raw().
Here is how:
Layout Page
@{
    var bodyHtmlHelper = RenderBody();
    var bodyHtml = bodyHtmlHelper.ToHtmlString();
}
<td style="padding: 15px; vertical-align: top;">
    @Html.Raw(bodyHtml)
    <p>
        <a href="@GetLinkToPage()">Get more information at @GetLinkToPage(false)</a>
    </p>
</td>
Now that I knew it was possible to get the HTML from RenderBody() and still render it in the layout, I was ready to try updating the HTML. (By the way, the same idea can be applied to RenderSection(). For example, var bannerHtmlHelper = RenderSection("BannerImage", false); returns the same HelperResult type as RenderBody().)
To process the HTML, I installed the HtmlAgilityPack NuGet package. I created functions that process all of the links and images in the HTML to make them fully qualified. (The functions below only show how to fix root-relative hrefs and srcs.)
@using HtmlAgilityPack
@functions {
    // Parses the rendered HTML, rewrites links and images, and returns the updated markup.
    static string FixHtml(string htmlString)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlString);
        doc = FixLinks(doc);
        doc = FixImages(doc);
        htmlString = doc.DocumentNode.OuterHtml;
        return htmlString;
    }

    // Makes root-relative image sources fully qualified.
    static HtmlDocument FixImages(HtmlDocument doc)
    {
        foreach (var image in doc.DocumentNode.Descendants("img"))
        {
            var att = image.Attributes["src"];
            // Skip anything that is not root relative, including protocol-relative ("//host/...") URLs.
            if (att == null || !att.Value.StartsWith("/") || att.Value.StartsWith("//"))
                continue;
            att.Value = "https://www.leadtools.com" + att.Value;
        }
        return doc;
    }

    // Makes root-relative links fully qualified and adds the tracking parameter.
    static HtmlDocument FixLinks(HtmlDocument doc)
    {
        foreach (var link in doc.DocumentNode.Descendants("a"))
        {
            var att = link.Attributes["href"];
            if (att == null || !att.Value.StartsWith("/") || att.Value.StartsWith("//"))
                continue;
            var href = "https://www.leadtools.com" + att.Value;
            var bookmark = href.IndexOf("#", StringComparison.Ordinal);
            var query = href.IndexOf("?", StringComparison.Ordinal);
            // The parameter has to go before any "#bookmark".
            bookmark = bookmark > -1 ? bookmark : href.Length;
            // A "?" that only appears inside the bookmark is not a query string.
            href = query == -1 || query > bookmark
                ? href.Insert(bookmark, "?SrcOrigin=VisualStudioGallery")
                : href.Insert(query + 1, "SrcOrigin=VisualStudioGallery&");
            att.Value = href;
        }
        return doc;
    }
}
In the code above, the HtmlAgilityPack HtmlDocument is loaded with the string returned by bodyHtmlHelper.ToHtmlString(). Once the links and images have been fixed, the HtmlDocument.DocumentNode.OuterHtml property is returned and rendered back to the client with a call to Html.Raw(). In the layout, that amounts to passing the string through FixHtml() before it is assigned to bodyHtml.
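To see what the href rewriting does to different link shapes, here is a small standalone sketch of the same insertion logic, ported to JavaScript only so it can be run outside the site. The function name tagHref and the sample paths are hypothetical. The sketch skips protocol-relative links and treats a ? inside the bookmark as part of the bookmark; treat those two guards as assumptions about the desired behavior.

```javascript
// Hypothetical standalone port of the href rewriting; not the site's code.
const BASE_URL = 'https://www.leadtools.com';
const TRACKING = 'SrcOrigin=VisualStudioGallery';

function tagHref(href) {
  // Leave absolute, protocol-relative ("//host/..."), and page-relative links alone.
  if (!href.startsWith('/') || href.startsWith('//')) {
    return href;
  }
  const url = BASE_URL + href;
  const hash = url.indexOf('#');
  // The parameter must be placed before any "#bookmark".
  const insertAt = hash > -1 ? hash : url.length;
  const query = url.indexOf('?');
  // A "?" that appears only inside the bookmark is not a query string.
  const hasQuery = query > -1 && query < insertAt;
  return hasQuery
    ? url.slice(0, query + 1) + TRACKING + '&' + url.slice(query + 1)
    : url.slice(0, insertAt) + '?' + TRACKING + url.slice(insertAt);
}

// Sample paths are made up for illustration.
const samples = [
  '/sdk/ocr',
  '/sdk/ocr#features',
  '/sdk/ocr?lang=en#features',
  'https://example.com/',
];
for (const sample of samples) {
  console.log(sample, '->', tagHref(sample));
}
```

The three root-relative samples cover the cases the C# code distinguishes: no query string, a bookmark only, and an existing query string followed by a bookmark. The absolute link is returned unchanged.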
This is slightly off topic, but I think it is still worth mentioning. To streamline copying and pasting the HTML markup, I added a div to wrap the part of the page I wanted to copy, a button, and a little JavaScript to the layout. Thanks, Stack Overflow! Here is the JavaScript:
function CopyClipboard() {
    // Create a new textarea element and give it the id 't'
    const t = document.createElement('textarea');
    t.id = 't';
    t.style.height = 0;
    document.body.appendChild(t);
    // Copy the contentDiv HTML into the textarea
    t.value = document.getElementById("contentDiv").innerHTML;
    // Now copy whatever is inside the textarea to the clipboard
    const selector = document.querySelector('#t');
    selector.select();
    document.execCommand('copy');
    // Remove the textarea
    document.body.removeChild(t);
}
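Since document.execCommand('copy') is marked as deprecated in current browsers, the same button could instead use the asynchronous Clipboard API. The following is only a sketch under a few assumptions: the page is served from a secure context (HTTPS or localhost), which navigator.clipboard requires, and the function name copyContentHtml plus its injectable doc and clipboard parameters are hypothetical, added so the function can be exercised outside a browser.

```javascript
// Hypothetical alternative to CopyClipboard(): copies an element's inner HTML
// to the clipboard without creating a temporary textarea.
// `doc` and `clipboard` default to the browser globals; they are parameters
// only so the function can be called with stand-ins outside a browser.
async function copyContentHtml(
  elementId,
  doc = globalThis.document,
  clipboard = globalThis.navigator?.clipboard
) {
  const element = doc.getElementById(elementId);
  if (!element) {
    throw new Error(`No element with id "${elementId}"`);
  }
  // writeText() returns a promise that rejects if the browser denies access.
  await clipboard.writeText(element.innerHTML);
  return element.innerHTML;
}

// In the layout, the button's click handler would call:
//   copyContentHtml('contentDiv');
```

The promise rejects when clipboard access is denied, so a real page would probably want a catch handler that falls back to the textarea approach.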
Content Pages
The content pages only required one change so that the new layout is used instead of our standard layout.
This:
@{
    Layout = "~/layouts/_Product.cshtml";
}
Became this:
@{
    Layout = "~/layouts/_Product.cshtml";
    if (HttpContext.Current.IsDebuggingEnabled)
    {
        if (HttpContext.Current.Request.QueryString["VSG"] != null)
        {
            Layout = "~/layouts/_ProductVSG.cshtml";
        }
    }
}
Because this is only for internal use, the code checks that debugging is enabled for the current context (web.config on our production servers is always set to release) and that a specific query string parameter is present before changing the layout.
Same Content, Different Layout
Since the process is streamlined, we have added a tickler to update the content on the MSDN Marketplace on a regular basis. Check out an example of the results:
Thanks to ASP.NET and layout pages, what once took days of tedious and error-prone HTML editing now takes less than 15 minutes of copy and paste.