bytes, str, and unicode

Posted: 2019-01-27 13:43:06

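1. Character sequence types in Python 3

  bytes -> raw 8-bit values (i.e., bytes)

  str -> Unicode characters

2. Character sequence types in Python 2

  str -> raw 8-bit values (i.e., bytes)

  unicode -> Unicode characters

In other words, Python 3's bytes corresponds to Python 2's str, and Python 3's str corresponds to Python 2's unicode.

When writing code, do not make any assumptions about the character encoding: a Unicode string has no inherent binary encoding, and a sequence of bytes has no inherent text encoding.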
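As a small illustration of why the encoding should not be assumed, the sketch below (Python 3) encodes the same text in two ways and then decodes it with the wrong codec. The sample string and the choice of Latin-1 as the second encoding are assumptions made for the example.

```python
text = "café"  # a str: a sequence of Unicode characters

# The same text becomes different byte sequences under different encodings.
utf8_bytes = text.encode("utf-8")      # 5 bytes: 'é' takes two bytes
latin1_bytes = text.encode("latin-1")  # 4 bytes: 'é' takes one byte

# Decoding with the wrong codec can silently garble the text...
garbled = utf8_bytes.decode("latin-1")

# ...or fail outright.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(utf8_bytes, latin1_bytes, garbled, decode_failed)
```

Bytes only become text once an encoding is named explicitly, which is the job of the helper functions.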

Write two helper functions to convert between the two types.

The first accepts str or bytes and always returns str:

def to_str(bytes_or_str):
    # Decode bytes as UTF-8; pass str through unchanged.
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value

The second accepts str or bytes and always returns bytes:

def to_bytes(bytes_or_str):
    # Encode str as UTF-8; pass bytes through unchanged.
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value
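A short usage sketch for the two helpers follows. The definitions are repeated in compact form so that the snippet runs on its own, and the sample values (including the idea that name arrived as raw bytes) are assumptions for illustration.

```python
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode('utf-8')
    return bytes_or_str


def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode('utf-8')
    return bytes_or_str


greeting = 'hello, '
name = b'world'  # hypothetical input that arrived as raw bytes

# In Python 3, greeting + name would raise TypeError (str + bytes),
# so normalize to one type at the boundary first.
message = greeting + to_str(name)
wire_data = to_bytes(message)

print(message, wire_data)
```

Converting at the edges of the program lets the core logic work with str throughout and leaves bytes for I/O.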

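3. In Python 3, a file handle returned by the built-in open function works in text mode by default: data is encoded and decoded with a text encoding (UTF-8 on most systems, although the actual default depends on the platform and locale), so write expects str rather than bytes.

To write binary data, open the file in binary mode by including b in the mode argument.

Use the open function like this: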

with open('path/filename', 'wb') as f:
    f.write(b'\xf1\xf2\xf3\xf4\xf5')  # binary mode takes bytes, not str

(The same problem arises when reading a file; in that case, use 'rb'.)
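The write and read cases can be tried together in one round trip. This is a minimal sketch that uses a temporary directory; the file name data.bin and the payload are assumptions for the example.

```python
import os
import tempfile

payload = b'\xf1\xf2\xf3\xf4\xf5'  # arbitrary bytes that are not valid UTF-8

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.bin')  # hypothetical file name

    # Binary mode: write() accepts bytes.
    with open(path, 'wb') as f:
        f.write(payload)

    # Binary mode: read() returns bytes, with no decoding applied.
    with open(path, 'rb') as f:
        data = f.read()

    # Text mode tries to decode the contents and fails on this payload.
    try:
        with open(path, 'r', encoding='utf-8') as f:
            f.read()
        text_read_failed = False
    except UnicodeDecodeError:
        text_read_failed = True

print(data == payload, text_read_failed)
```

For files that really contain text, passing encoding explicitly to open avoids relying on the platform default.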

Source: https://www.cnblogs.com/walthwang/p/10326123.html
