String
Protected Function GetAsciiString(ByVal Buffer As Byte(), ByVal SourceEncoding As Encoding) As String

    ' The target encoding is always ASCII; the source encoding is supplied by the caller.
    Dim TargetEncoding As Encoding = Encoding.ASCII

    ' Perform the conversion from one encoding to the other.
    Dim asciiBytes As Byte() = Encoding.Convert(SourceEncoding, TargetEncoding, Buffer)

    ' Convert the new bytes into an ASCII string.
    Dim asciiString As String = System.Text.Encoding.ASCII.GetString(asciiBytes)

    Return asciiString
End Function

Depending on the encoding detected for the file, a different source encoding is passed in:
Dim strScript As String = ""
Select Case sqlFile.Encoding
    Case PaTextEncoding.UTF16LittleEndian
        strScript = GetAsciiString(sqlFile.Buffer, System.Text.Encoding.Unicode) 'System.Text.Encoding.Unicode.GetString(sqlFile.Buffer)
    Case PaTextEncoding.UTF16BigEndian
        strScript = GetAsciiString(sqlFile.Buffer, System.Text.Encoding.BigEndianUnicode) 'System.Text.Encoding.BigEndianUnicode.GetString(sqlFile.Buffer)
    Case PaTextEncoding.UTF8
        strScript = GetAsciiString(sqlFile.Buffer, System.Text.Encoding.UTF8) 'System.Text.Encoding.UTF8.GetString(sqlFile.Buffer)
    Case PaTextEncoding.UTF7
        strScript = GetAsciiString(sqlFile.Buffer, System.Text.Encoding.UTF7) 'System.Text.Encoding.UTF7.GetString(sqlFile.Buffer)
    Case PaTextEncoding.Unknown
        Throw New Exception(String.Format(SQL_UnknownFile, sqlFile.Name))
End Select

'This check needs to be included because the unicode Byte Order mark results in an extra character at the start of the file
'The extra character - '?' - causes an error with the database.
If strScript.StartsWith("?") Then
    strScript = strScript.Substring(1)
End If

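Where the leading "?" comes from can be reproduced in isolation. The following is a minimal sketch, assuming a UTF-8 buffer that begins with a BOM; the module name BomDemo, the function name, and the sample SQL text are hypothetical and not part of DNN. It runs the buffer through the same Encoding.Convert call that GetAsciiString uses.

```vbnet
Imports System
Imports System.Text

' Hypothetical demo module, not DNN code.
Public Module BomDemo

    ' Builds a UTF-8 buffer that starts with the BOM (EF BB BF) and converts
    ' it to ASCII the same way GetAsciiString does.
    Public Function ConvertBomPrefixedUtf8ToAscii(ByVal sqlText As String) As String
        Dim bom As Byte() = Encoding.UTF8.GetPreamble()
        Dim body As Byte() = Encoding.UTF8.GetBytes(sqlText)

        ' Concatenate BOM + body into a single buffer.
        Dim buffer(bom.Length + body.Length - 1) As Byte
        Array.Copy(bom, 0, buffer, 0, bom.Length)
        Array.Copy(body, 0, buffer, bom.Length, body.Length)

        ' The BOM decodes to the character U+FEFF. ASCII cannot represent it,
        ' so the default replacement fallback writes "?" in its place.
        Dim asciiBytes As Byte() = Encoding.Convert(Encoding.UTF8, Encoding.ASCII, buffer)
        Return Encoding.ASCII.GetString(asciiBytes)
    End Function

End Module
```

Calling BomDemo.ConvertBomPrefixedUtf8ToAscii("SELECT 1") returns "?SELECT 1", which is the extra character the StartsWith("?") check removes.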
One last problem
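DNN's way of keeping the BOM from interfering with decoding has a problem of its own: it converts every file to ASCII, and ASCII is a 7-bit encoding that cannot represent double-byte characters. In other words, if a file contains Chinese text, that text is lost after decoding, with each Chinese character coming out as a question mark. For a description of the symptoms, see the article "SQL SERVER 2005 EXPRESS与ASP.net出现中文变成问号的奇怪问题。很可能不是通常的utf-8编码问题。" ("A strange problem where Chinese turns into question marks with SQL SERVER 2005 EXPRESS and ASP.net; probably not the usual UTF-8 encoding problem").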
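The loss is easy to see with a short experiment. This is a minimal sketch, assuming the script was saved as UTF-8; the module AsciiLossDemo and its function are hypothetical names used only for illustration.

```vbnet
Imports System
Imports System.Text

' Hypothetical demo module, not DNN code.
Public Module AsciiLossDemo

    ' Encodes the text as UTF-8, then applies the same UTF-8 -> ASCII
    ' conversion that GetAsciiString performs and returns the result.
    Public Function RoundTripThroughAscii(ByVal original As String) As String
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)
        Dim asciiBytes As Byte() = Encoding.Convert(Encoding.UTF8, Encoding.ASCII, utf8Bytes)
        Return Encoding.ASCII.GetString(asciiBytes)
    End Function

End Module
```

For example, AsciiLossDemo.RoundTripThroughAscii("N'中文'") returns "N'??'": the ASCII characters survive, and each Chinese character is replaced by a question mark.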
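I think the solution is to decode every file as UTF instead of converting it to ASCII. For the BOM problem, when a UTF-8 buffer starts with the BOM (bytes EF BB BF), use UTF8Encoding.GetString(buffer, 3, buffer.Length - 3) to skip the first three bytes of the byte array. The third argument is a byte count rather than an end index, so it has to be buffer.Length - 3.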
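One way this could look for UTF-8 files is sketched below. It is only an illustration of the idea, not DNN code: the module Utf8ScriptReader and the function DecodeUtf8SkippingBom are hypothetical names, and UTF-16 files would need the same treatment with their own two-byte BOM.

```vbnet
Imports System
Imports System.Text

' Hypothetical helper module, not DNN code.
Public Module Utf8ScriptReader

    ' Decodes a UTF-8 buffer, skipping the 3-byte BOM (EF BB BF) only when
    ' the buffer actually starts with it.
    Public Function DecodeUtf8SkippingBom(ByVal buffer As Byte()) As String
        Dim bom As Byte() = Encoding.UTF8.GetPreamble()
        Dim offset As Integer = 0

        If buffer.Length >= bom.Length AndAlso _
           buffer(0) = bom(0) AndAlso _
           buffer(1) = bom(1) AndAlso _
           buffer(2) = bom(2) Then
            offset = bom.Length
        End If

        ' GetString takes (bytes, index, count); count is the number of bytes
        ' remaining after the offset, not the total buffer length.
        Return Encoding.UTF8.GetString(buffer, offset, buffer.Length - offset)
    End Function

End Module
```

Because the text is never forced through ASCII, Chinese characters in the script are preserved, and the result no longer starts with a stray character that has to be trimmed afterwards.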