jeffwhitledge.com

CP-1250 to UTF-8 Conversion

A long time ago I mentioned in my blog that I had developed a CP-1250 to UTF-8 converter in Visual Basic 6. This entry has generated more email than all of my other entries combined! So I've decided, for your convenience, to post the source code here. Enjoy!

Visual Basic .Net version

I've been doing stuff in Visual Basic .Net lately. This kind of thing is a lot easier to do in .Net. Here is the code to write a UTF-8 file in VB.Net. The code point translation portion is omitted, but it's a simple Select statement; the complexity in the previous program was all in the bit manipulation necessary to produce UTF-8:

Imports System.IO
Module Module1
    Sub Main()
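        'Converts an ASCII file to UTF-8.
        'The user inputs the source and destination file path names.
        'Note: Encoding.ASCII decodes any byte above 127 as "?", so as
        'written this only preserves plain 7-bit text.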

        Dim OriginalFilePath As String
        Dim NewFilePath As String

        OriginalFilePath = InputBox("What file do you wish to convert?")
        NewFilePath = InputBox("What is path/name of the output file?")

        Dim OriginalFileReader As New StreamReader(OriginalFilePath, _
            System.Text.Encoding.ASCII)
        Dim NewFileWriter As New StreamWriter(NewFilePath, False, _
            System.Text.Encoding.UTF8)

        Do
            Dim NextLine As String
            NextLine = OriginalFileReader.ReadLine()
            If NextLine Is Nothing Then Exit Do
            NewFileWriter.WriteLine(NextLine)
        Loop

        OriginalFileReader.Close()
        NewFileWriter.Close()
    End Sub
End Module
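
For what it's worth, the omitted translation step can probably be left to the framework too. Here is a minimal sketch, assuming the runtime ships the code page 1250 encoding (the .Net Framework on Windows does; on .Net Core and later it has to be registered first, as the comment notes). The module name Cp1250Sketch and the function name Cp1250ToUtf8 are made up for this illustration and are not part of the original converter.

```vbnet
Imports System
Imports System.Text

Module Cp1250Sketch
    ' Hypothetical helper: re-encodes CP-1250 bytes as UTF-8 bytes.
    Function Cp1250ToUtf8(ByVal source As Byte()) As Byte()
        ' Assumption: code page 1250 is available. On .Net Core and later,
        ' call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        ' once before this line.
        Dim cp1250 As Encoding = Encoding.GetEncoding(1250)

        ' Encoding.Convert decodes with the first encoding and
        ' encodes with the second. No byte order mark is added.
        Return Encoding.Convert(cp1250, Encoding.UTF8, source)
    End Function

    Sub Main()
        ' &H8A is "S with caron" (U+0160) in CP-1250.
        Dim utf8 As Byte() = Cp1250ToUtf8(New Byte() {&H8A})
        Console.WriteLine(BitConverter.ToString(utf8)) ' prints C5-A0
    End Sub
End Module
```

In the file-based program above, the same idea would amount to passing System.Text.Encoding.GetEncoding(1250) to the StreamReader instead of System.Text.Encoding.ASCII, so that bytes above 127 are decoded rather than replaced.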