Some things about ASCII, Unicode, and UTF-8

2024-04-20 09:32


  • ASCII: each character is stored in 1 byte, but only 128 characters (7 bits) are defined, so it is mostly used for plain English text.
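  • Unicode: ASCII cannot represent the characters of most other languages. Unicode unifies the characters of all writing systems under one standard by assigning each a unique code point. In an encoding such as UTF-16, a character usually takes 2 bytes and sometimes more.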
  • UTF-8:
    – Storing every character in 2 or more bytes costs memory: English text usually takes twice the space it needs in ASCII. UTF-8 was created to address this. It encodes each character in 1 to 4 bytes depending on the character (the original design allowed up to 6). For example, an English letter takes 1 byte and a Chinese character usually takes 3 bytes.
    – UTF-8 is backward compatible with ASCII: any ASCII text is also valid UTF-8. Some older software that only supports ASCII can therefore still work with UTF-8.
    – In memory, a program typically holds text in a Unicode form so that it can represent every character, while text stored on disk is usually encoded as UTF-8.
    – For example, when you edit text in Notepad, the characters are held in memory as Unicode. When you save the file to disk, the data is typically converted to UTF-8.
    – Client/server communication works the same way. When you browse the web, the server builds the page as Unicode text and encodes it as UTF-8 before sending it to the client. The client then decodes the data back to Unicode and renders the page in the browser.
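
The points above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `utf8_length` and the sample strings are assumptions chosen for illustration, not part of the original notes. It prints how many bytes each sample character takes in UTF-8 and UTF-16, confirms that ASCII text is byte-for-byte identical in UTF-8, and mimics the "Unicode in memory, UTF-8 on disk or on the wire" round trip.

```python
# Illustrative sketch; utf8_length and the sample strings are hypothetical.

def utf8_length(text: str) -> int:
    """Return the number of bytes needed to encode text as UTF-8."""
    return len(text.encode("utf-8"))


# One character each from ASCII, Latin-1, Chinese, and the emoji range.
samples = ["A", "é", "中", "😀"]
for ch in samples:
    utf16_bytes = len(ch.encode("utf-16-le"))  # little-endian, no BOM
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) in UTF-8, "
          f"{utf16_bytes} byte(s) in UTF-16")

# ASCII text produces exactly the same bytes under ASCII and UTF-8.
assert "hello".encode("ascii") == "hello".encode("utf-8")

# In memory a Python str is a sequence of Unicode code points.
page = "Hello, 世界"

# Encode to UTF-8 before writing to disk or sending over the network...
on_wire = page.encode("utf-8")

# ...and decode back to Unicode on the receiving side.
restored = on_wire.decode("utf-8")

print(len(page), "characters,", len(on_wire), "bytes as UTF-8")
```

Here the English letter takes 1 byte and the Chinese character 3 bytes in UTF-8, while both take 2 bytes in UTF-16, which is the memory trade-off described above.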




http://www.chinasem.cn/article/919884
