What is UTF8MB4 in MySQL
What is the difference between the utf8mb4 and utf8 character sets in MySQL?
I already know the ASCII, UTF-8, UTF-16 and UTF-32 encodings. However, I'm curious what the difference is between the utf8mb4 group of encodings and the other encoding types defined in MySQL Server.
Are there any special benefits of, or recommendations for, using utf8mb4 rather than utf8?
UTF-8 is a variable-length encoding: it takes one to four bytes to store a code point. The MySQL encoding "utf8" (an alias for "utf8mb3") stores a maximum of three bytes per code point.
The "utf8" / "utf8mb3" character set therefore cannot store all Unicode code points. It only supports the range 0x0000 to 0xFFFF, which is referred to as the "Basic Multilingual Plane" (BMP). See also Comparison of Unicode Encodings.
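The byte counts can be checked outside MySQL. The following is a minimal Python sketch that uses only the standard library; the helper name `utf8_length` and the sample characters are illustrative choices, not taken from the MySQL documentation.

```python
# Sample code points chosen for illustration: three from the BMP
# (U+0041, U+00E9, U+20AC) and one supplementary character (U+1F600).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch):
    """Return the number of bytes UTF-8 needs to encode one code point."""
    return len(ch.encode("utf-8"))


for ch in samples:
    cp = ord(ch)
    # Code points up to U+FFFF lie in the Basic Multilingual Plane.
    in_bmp = cp <= 0xFFFF
    print(f"U+{cp:04X} bytes={utf8_length(ch)} BMP={in_bmp}")
```

The three BMP characters need one, two and three bytes, so they fit in "utf8mb3". The emoji needs four bytes, which is why it can only be stored in "utf8mb4".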
This is what the MySQL documentation (an earlier version of the same page) has to say about it:
The character set utf8 [/ utf8mb3] uses a maximum of three bytes per character and contains only BMP characters. As of MySQL 5.5.3, the character set utf8mb4 uses a maximum of four bytes per character and supports supplementary characters:
For a BMP character, utf8 [/ utf8mb3] and utf8mb4 have identical storage characteristics: same code values, same encoding, same length.
For a supplementary character, utf8 [/ utf8mb3] cannot store the character at all, whereas utf8mb4 requires four bytes to store it. Because utf8 [/ utf8mb3] cannot store the character at all, you have no supplementary characters in utf8 [/ utf8mb3] columns and need not worry about converting characters or losing data when upgrading utf8 [/ utf8mb3] data from older versions of MySQL.
So if you want your column to support storing characters that lie outside the BMP, such as emoji (and you usually do), use "utf8mb4". See also What Are the Most Commonly Used Non-BMP Unicode Characters?
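To judge whether existing application data would be affected, one option is to scan strings for code points above U+FFFF before they reach the database. The following is a rough Python sketch; the helper names `needs_utf8mb4` and `supplementary_chars` are hypothetical and not part of MySQL or any client library.

```python
def needs_utf8mb4(text):
    """Return True if text contains a code point outside the BMP (above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def supplementary_chars(text):
    """Return the characters in text that a three-byte utf8mb3 column cannot store."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Example strings chosen for illustration.
print(needs_utf8mb4("caf\u00e9 \u20ac"))      # BMP characters only
print(needs_utf8mb4("hello \U0001F600"))     # contains an emoji
```

The check relies only on the code point value, so it is independent of the MySQL version and collation in use.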
Taken from the MySQL 8.0 reference manual:
utf8mb4: A UTF-8 encoding of the Unicode character set using one to four bytes per character.
utf8mb3: A UTF-8 encoding of the Unicode character set using one to three bytes per character.
In MySQL, utf8 is currently an alias for utf8mb3, which is deprecated and will be removed in a future MySQL release. At that point, utf8 will become a reference to utf8mb4.
Regardless of this alias, you can therefore deliberately specify the utf8mb4 encoding yourself.
To complete the answer, I would like to add the comment from @WilliamEntriken below (also taken from the manual):
To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.