Comments on: Character sets: latin1 vs. ascii https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii Blog by Shlomi Noach

By: shlomi https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2578 Thu, 09 Jul 2009 04:43:58 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2578 Hi Brian,

Somehow I’m not surprised. You guys take the good stuff and throw away the rest!

Shlomi

By: Brian Aker https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2570 Wed, 08 Jul 2009 15:38:19 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2570 Hi!

In Drizzle we made utf8 the default and optimized around it (the default collation is utf8_general_ci). For anything else? Just use binary.

Cheers,
-Brian

By: Mchl https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2569 Wed, 08 Jul 2009 12:07:57 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2569 Yeah. I forgot how VARCHAR behaves in MEMORY for a moment.
It gets tricky indeed 😉

Personally I use case insensitive collations more often (for user supplied data at least).

By: shlomi https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2567 Wed, 08 Jul 2009 10:34:23 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2567 Mchl,

Just as another example, we can define a VARCHAR, utf8 column on a MEMORY table.
I wasn’t asking for fixed width – but MySQL/MEMORY made it so.

Regards

By: shlomi https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2565 Wed, 08 Jul 2009 10:09:59 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2565 hartmut,

Thanks, I think we both agree here.
I saw the need to mention it because of a widespread misconception: that utf8 columns always take only as much storage as their content needs.
So the notion of “you asked for a fixed size column” is not clear to some.

I hope this clarifies.
Regards

By: hartmut https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2564 Wed, 08 Jul 2009 09:47:10 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2564 > For example, if you have CHAR(10) CHARSET utf8, then each such value will take exactly 30 bytes, regardless of content

Well, you asked for a fixed size column, so you got a fixed size column, and as it is fixed size it needs to be big enough to store ten 3-byte utf8 sequences up front.
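
The arithmetic behind the 30-byte figure can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 3-bytes-per-character limit is an assumption about MySQL's utf8 character set as discussed in this thread. Whether a particular engine or row format really reserves the full width is up to that engine.

```python
# Assumption: MySQL's utf8 character set uses at most 3 bytes per character.
MAX_BYTES_PER_CHAR = 3


def fixed_width_bytes(char_length, max_bytes_per_char=MAX_BYTES_PER_CHAR):
    """Bytes a fixed-width column must reserve to hold char_length characters."""
    return char_length * max_bytes_per_char


def actual_utf8_bytes(value):
    """Bytes the value itself occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


reserved = fixed_width_bytes(10)     # CHAR(10) CHARSET utf8
needed = actual_utf8_bytes("hello")  # pure ASCII content
print(reserved, needed)              # prints: 30 5
```

The gap between the two numbers is the space a fixed-width layout spends on the worst case, whatever the content turns out to be.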

By: shlomi https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2563 Wed, 08 Jul 2009 08:38:44 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2563 Thanks for the correction; I’ve updated the text.

I'm of the opinion that collations should be case sensitive by default; this makes for faster comparisons.

utf8 encodes ASCII characters as single bytes – true; but MySQL and its engines do not necessarily follow. For example, if you have CHAR(10) CHARSET utf8, then each such value will take exactly 30 bytes, regardless of content. See also: MySQL’s character sets and collations demystified

By: Mchl https://shlomi-noach.github.io/blog/mysql/character-sets-latin1-vs-ascii/comment-page-1#comment-2562 Wed, 08 Jul 2009 08:16:05 +0000 https://shlomi-noach.github.io/blog/?p=828#comment-2562 Latin1 covers Western European languages. Central Europe is covered by Latin2 CP. 😉

I agree though, utf8 should be introduced as a default encoding, and utf8_general_ci as default collation. AFAIK utf8 stores ASCII characters as single byte values.
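
A quick way to check both points (ASCII as single bytes in utf8, and Latin1 vs. Latin2 coverage) is to encode a few sample characters. The sketch below uses Python's `utf-8`, `latin-1` and `iso8859_2` codecs as stand-ins for the MySQL character sets; that mapping is an assumption (MySQL's latin1 is closer to cp1252), and `encoded_length` is a helper invented for this example.

```python
def encoded_length(text, codec):
    """Byte length of text in codec, or None if the codec cannot represent it."""
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


samples = (
    "a",       # ASCII letter
    "\u00e9",  # e with acute accent (Western European)
    "\u0142",  # l with stroke (Polish, Central European)
    "\u20ac",  # euro sign
)
codecs = ("utf-8", "latin-1", "iso8859_2")

for ch in samples:
    sizes = {codec: encoded_length(ch, codec) for codec in codecs}
    # Print the code point rather than the character to stay terminal-safe.
    print(f"U+{ord(ch):04X}", sizes)
```

The ASCII letter takes one byte everywhere, the Polish letter fits in Latin2 but not in Latin1, and UTF-8 represents all four at one to three bytes each.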
