Invalid utf-8 byte sequences in varchar #252

@chulkilee

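When I query a varchar column through ProxySQL, mariaex returns the value as a binary that is not valid UTF-8 instead of the expected string. Querying MySQL directly returns the correct string.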

  • mariaex: 0.9.1
  • proxysql: 1.4.12 (severalnines/proxysql:1.4.12 docker image)
  • table: InnoDB / CHARSET=utf8 COLLATE=utf8_unicode_ci
  • column: varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci

Query:

query = "SELECT name, HEX(name), LENGTH(name), BIT_LENGTH(name) FROM my_table WHERE id = 1;"

When talking to ProxySQL:

Mariaex.query(pid, query)

{:ok, %Mariaex.Result{
    columns: ["name", "HEX(name)", "LENGTH(name)", "BIT_LENGTH(name)"],
    connection_id: #PID<0.458.0>,
    last_insert_id: nil,
    num_rows: 1,
    rows: [[<<199, 224, 226>>, "C387C3A0C3A2", 6, 48]]
  }
}

When talking directly to MySQL:

Mariaex.query(pid, query)

{:ok, %Mariaex.Result{
    columns: ["name", "HEX(name)", "LENGTH(name)", "BIT_LENGTH(name)"],
    connection_id: #PID<0.1256.0>,
    last_insert_id: nil,
    num_rows: 1,
    rows: [["Çàâ", "C387C3A0C3A2", 6, 48]]
  }
}
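
To spot which cells differ between the two results, a small check over the returned rows can help. This is only a sketch: `Utf8Check` and `invalid_cells/1` are hypothetical names, not part of Mariaex, and the rows are copied from the two results above.

```elixir
defmodule Utf8Check do
  # Hypothetical helper (not part of Mariaex): returns
  # {row_index, column_index, value} for every binary cell
  # that is not valid UTF-8. Non-binary cells are ignored.
  def invalid_cells(rows) do
    for {row, r} <- Enum.with_index(rows),
        {cell, c} <- Enum.with_index(row),
        is_binary(cell) and not String.valid?(cell) do
      {r, c, cell}
    end
  end
end

# Rows as returned through ProxySQL and directly from MySQL.
proxysql_rows = [[<<199, 224, 226>>, "C387C3A0C3A2", 6, 48]]
mysql_rows = [["Çàâ", "C387C3A0C3A2", 6, 48]]

Utf8Check.invalid_cells(proxysql_rows)
# => [{0, 0, <<199, 224, 226>>}]

Utf8Check.invalid_cells(mysql_rows)
# => []
```

Only the `name` column differs. `HEX(name)`, `LENGTH(name)` and `BIT_LENGTH(name)` are identical in both results, so the stored bytes appear to be the same six UTF-8 bytes in both cases.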

Examining the value:

str = "Çàâ"
raw = <<199, 224, 226>>

String.valid?(raw)
# => false

byte_size(str)
# => 6

byte_size(raw)
# => 3

to_charlist(str)
# => [199, 224, 226]

Base.decode16("C387C3A0C3A2")
# => {:ok, "Çàâ"}
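
The numbers above suggest that the three bytes returned through ProxySQL are the Latin-1 encoding of the same three characters, since they equal the code points of "Çàâ". This is an inference from the values, not a confirmed cause. A minimal sketch that checks it with Erlang's `:unicode.characters_to_binary/3`:

```elixir
raw = <<199, 224, 226>>
{:ok, utf8} = Base.decode16("C387C3A0C3A2")

# Treat each byte of raw as a Latin-1 code point and re-encode as UTF-8.
converted = :unicode.characters_to_binary(raw, :latin1, :utf8)

converted == utf8
# => true

String.valid?(converted)
# => true

byte_size(converted)
# => 6

# The reverse direction gives back the bytes seen through ProxySQL.
:unicode.characters_to_binary(utf8, :utf8, :latin1) == raw
# => true
```

If that assumption holds, the value is being transcoded from UTF-8 to Latin-1 somewhere between the server and the client when the connection goes through ProxySQL.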
