Importing a Wordpress blog into dasBlog

A friend is thinking about moving his blog off WordPress's free hosted service.  I figured I'd try to persuade him to go to a .NET-based blog, and dasBlog seemed a good candidate.
 
The biggest problem so far has been moving the existing blog entries and comments.  WordPress does have an export function, but instead of outputting something other blog engines can understand, it produces its own 'extended' version of RSS.  Yuk.
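For anyone who hasn't looked inside one of these exports, here is a rough sketch of what reading it involves.  The XML fragment is hand-written and made up for illustration (it is not from a real export), and the class and method names are my own; the two namespace URIs are the ones the importer further down registers.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class WxrSketch
{
    // Namespace URIs as registered by the importer in this post.
    private const string ContentNamespace = "http://purl.org/rss/1.0/modules/content/";
    private const string WordpressNamespace = "http://wordpress.org/export/1.0/";

    // A made-up fragment in the general shape of an export file; not real data.
    public const string Sample = @"<rss version=""2.0""
    xmlns:content=""http://purl.org/rss/1.0/modules/content/""
    xmlns:wp=""http://wordpress.org/export/1.0/"">
  <channel>
    <item>
      <title>Hello world</title>
      <content:encoded><![CDATA[<p>First post.</p>]]></content:encoded>
      <wp:comment>
        <wp:comment_author>Alice</wp:comment_author>
        <wp:comment_content>Nice post!</wp:comment_content>
      </wp:comment>
    </item>
  </channel>
</rss>";

    // Returns one summary line per <item>: title, body and comment count.
    public static List<string> Summarise (string wxr)
    {
        XmlDocument document = new XmlDocument ();
        document.LoadXml (wxr);

        // Prefixed elements can only be reached from XPath via a namespace manager.
        XmlNamespaceManager namespaces = new XmlNamespaceManager (document.NameTable);
        namespaces.AddNamespace ("content", ContentNamespace);
        namespaces.AddNamespace ("wp", WordpressNamespace);

        List<string> lines = new List<string> ();
        foreach (XmlNode item in document.SelectNodes ("/rss/channel/item"))
        {
            string title = item.SelectSingleNode ("title").InnerText;
            string body = item.SelectSingleNode ("content:encoded", namespaces).InnerText;
            int comments = item.SelectNodes ("wp:comment", namespaces).Count;
            lines.Add (title + " | " + body + " | comments: " + comments);
        }

        return lines;
    }
}
```

Calling WxrSketch.Summarise (WxrSketch.Sample) gives a single line for the one sample item.  The ordinary RSS parts (title, pubDate, guid) need no namespace handling; it's the content: and wp: elements that other blog engines don't know what to do with.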
 
The good news: There's a replacement 'export.php' file that exports BlogML!  Yay!
 
The bad news: Since his blog is hosted on WordPress's own servers, there's no way to install the replacement export.php file on his site and run it to get the BlogML.  Boo!
 
So I wrote a program to grab the data from the 'WordPress eXtended RSS' (you again) file and use dasBlog's API to import it into the new blog.
 
Surprisingly enough, it worked.  (To be fair though, I just snarfed the code from Scott Hanselman and hacked it to work with Wordpress.)
 
So, here it is in case it ever proves useful to you.  It's not code I'm particularly proud of, but it worked the one and only time I needed to run it.  It might be a useful starting point for someone else.
 

using System;
using System.Xml;
using newtelligence.DasBlog.Runtime;

namespace OpinionatedGeek.Applications.WordPressToDasBlog
{
    internal class Program
    {
        private const string ContentNamespace = "http://purl.org/rss/1.0/modules/content/";
        private const string WordpressNamespace = "http://wordpress.org/export/1.0/";

        private static int _entryIdCounter = 0;

        private static void Main ()
        {
            // dasBlog reads and writes its entries in a 'content' folder next to the executable.
            IBlogDataService dataService = BlogDataServiceFactory.GetService (AppDomain.CurrentDomain.BaseDirectory + "\\content", null);

            XmlDocument exported = new XmlDocument ();
            exported.Load (@"..\..\wordpress.xml");
            XmlNamespaceManager namespaces = new XmlNamespaceManager (exported.NameTable);
            namespaces.AddNamespace ("content", ContentNamespace);
            namespaces.AddNamespace ("wp", WordpressNamespace);

            XmlNodeList items = exported.SelectNodes ("/rss/channel/item");
            foreach (XmlNode item in items)
            {
                ImportBlogEntry (dataService, namespaces, item);
            }

            // Keep the console window open until Enter is pressed.
            Console.In.ReadLine ();

            return;
        }

        private static void ImportBlogEntry (IBlogDataService dataService, XmlNamespaceManager namespaces, XmlNode item)
        {
            DateTime postDate = DateTime.Parse (item ["pubDate"].InnerText);

            string blogText = item ["content:encoded"].InnerText;
            string blogTitle = item ["title"].InnerText;
            string guid = item ["guid"].InnerText;
            Entry entry = new Entry ();
            entry.CreatedLocalTime = postDate;
            entry.ModifiedLocalTime = postDate;
            entry.Title = blogTitle;
            // Assumed reconstruction: the replacement string was lost when this page was
            // rendered; an HTML line break is the most likely original.
            entry.Content = blogText.Replace ("\r\n", "<br />");

            // There seems to be a problem with dasBlog's entry lookup code.  It HTML encodes the
            // entry ID to do the lookup, but they're stored unencoded (as far as I can tell, which
            // isn't very far).  So, we need to use an entry ID which is the same when HTML encoded
            // and unencoded.  This seems to rule out the normal GUIDs that Wordpress uses (which
            // are just entry URLs).  Let's keep it simple and use an increasing int counter.
            //entry.EntryId = guid;
            entry.EntryId = (++_entryIdCounter).ToString ();

            // dasBlog takes categories as a single semicolon-separated string.
            string categories = "";
            foreach (XmlNode categoryItem in item.SelectNodes ("category"))
            {
                categories += categoryItem.InnerText + ";";
            }
            categories = categories.Trim (';');
            entry.Categories = categories;
            entry.Author = "Paul";
            entry.AllowComments = true;
            dataService.SaveEntry (entry);

            Console.Out.WriteLine ("Title: {0}", blogTitle);
            Console.Out.WriteLine ("Date: {0}", postDate);
            Console.Out.WriteLine ("Categories: {0}", categories);
            Console.Out.WriteLine ("GUID: {0}", guid);

            foreach (XmlNode commentNode in item.SelectNodes ("wp:comment", namespaces))
            {
                ImportComment (entry, dataService, commentNode);
            }

            return;
        }

        private static void ImportComment (Entry entry, IBlogDataService dataService, XmlNode commentNode)
        {
            DateTime commentDate = DateTime.Parse (commentNode ["wp:comment_date"].InnerText);
            string commentText = commentNode ["wp:comment_content"].InnerText;
            string commentAuthorName = commentNode ["wp:comment_author"].InnerText;
            string commentAuthorEmail = commentNode ["wp:comment_author_email"].InnerText;
            string commentAuthorHomepage = commentNode ["wp:comment_author_url"].InnerText;
            string commentAuthorIPAddress = commentNode ["wp:comment_author_IP"].InnerText;

            Comment comment = new Comment ();
            comment.CreatedLocalTime = commentDate;
            comment.ModifiedLocalTime = commentDate;
            comment.TargetEntryId = entry.EntryId;
            comment.TargetTitle = entry.Title;
            comment.Author = commentAuthorName;
            comment.AuthorEmail = commentAuthorEmail;
            comment.AuthorHomepage = commentAuthorHomepage;
            comment.AuthorIPAddress = commentAuthorIPAddress;
            comment.Content = commentText;

            Console.Out.WriteLine ("Comment Author: {0} ({1})", commentAuthorName, commentAuthorEmail);
            Console.Out.WriteLine ("Comment IP/Home Page: {0} ({1})", commentAuthorIPAddress, commentAuthorHomepage);
            Console.Out.WriteLine ("Comment Date: {0}", commentDate);
            Console.Out.WriteLine ("Comment: {0}", commentText);

            dataService.AddComment (comment);

            return;
        }
    }
}
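The entry ID workaround in the comment above rests on one condition: the ID has to read the same before and after HTML encoding.  Here's a small sketch of just that condition, in case you want to check your own IDs.  The class and method names are made up for illustration, and it uses System.Net.WebUtility (available from .NET 4.0 onwards) as a stand-in; I haven't confirmed which encoder dasBlog itself calls, so treat it as an approximation.

```csharp
using System;
using System.Net;

public static class EntryIdSketch
{
    private static int _counter = 0;

    // True when HTML encoding leaves the candidate ID unchanged, which is the
    // condition the importer's comment says a usable entry ID has to meet.
    // Assumption: WebUtility.HtmlEncode approximates whatever dasBlog uses.
    public static bool SurvivesHtmlEncoding (string candidate)
    {
        return WebUtility.HtmlEncode (candidate) == candidate;
    }

    // The importer's fallback: an increasing integer, which contains nothing
    // that HTML encoding would touch.
    public static string NextCounterId ()
    {
        return (++_counter).ToString ();
    }
}
```

A plain number such as "17" passes the check, while a hypothetical URL with a query string like "http://example.wordpress.com/?p=12&preview=true" fails it because the ampersand gets encoded.  Passing this check doesn't prove an ID is safe for dasBlog, which may have other constraints, so the counter remains the cautious choice.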

 
I can't help thinking it would make a neat PowerShell command...
 
Anyway, hope it's useful, and if it breaks things you get to keep both pieces.


Categories: .NET
Posted by 'geoff' on Friday, 19 October 2007 at 5:30PM


Comments, Trackbacks and Pingbacks

Thanks for the tips
Yep, my friend also had trouble moving his blog from the free Wordpress hosted service, and I helped him with it. Everything is the way you described, and some of the tips are really useful; I hadn't thought of them myself. You can visit the web source I was using to take tutorials and templates from. I hope I can also help someone.


Posted by 'Dave' on Saturday, 20 February 2010 at 1:21PM
